Text

GNU find never ceases to amaze me with its utility.

Yesterday I had to do an emergency restart of MySQL in production, and the resulting Magento report/xxxx files swamped everything else I might have wanted to look at.

So specifically, I wanted to delete all the files that were created between a start and end date.

GNU find makes this easy:

$ find . -type f -newer 655958024588 ! -newer 627812220533 -ls
  

This instructs find to list (-ls) all files (-type f) that are newer than a file called 655958024588 (-newer) and not newer than 627812220533 (! -newer).

If you do not have two files to act as date range boundaries, you can use touch to create them.

$ touch -t yyyymmddHHMM start_date_file
  $ touch -t yyyymmddHHMM end_date_file
  

Then supply these file names to -newer and ! -newer.
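
If, say, you wanted everything modified between midnight on the 1st and midnight on the 8th of May 2012 (dates made up for illustration), it would look like this:

$ touch -t 201205010000 start_date_file
  $ touch -t 201205080000 end_date_file
  $ find . -type f -newer start_date_file ! -newer end_date_file -ls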

To delete the files, we can use -exec.

$ find . -type f -newer 655958024588 ! -newer 627812220533 -exec rm -rf {} \;
  

Here the -exec argument does the heavy lifting. {} is a placeholder for the file name (find substitutes each found filename for '{}') and \; terminates the command sequence (the semicolon means much what it does in regular bash; the backslash just stops your shell from interpreting it before find sees it).

Text

Veewee considerably simplifies the process of creating base distribution images for use with Vagrant, but unfortunately you have to choose between using the easy-to-install gem (which comes with horribly out-of-date basebox templates) and installing the latest version from source, which unfortunately uses rvm in a pretty repugnant way.

So, if you want to use veewee to set up a new amd64 Precise Pangolin basebox for Vagrant, you either have to pull the latest veewee sources from GitHub, or download the most recent release's templates and copy them over into your veewee gem folder.

This is where veewee-templates-update steps in: it automates the latter step, downloading and installing just the updated templates for you.

Installation is simple:

$ gem install veewee-templates-update
  

Then, just run the updater:

$ veewee-templates-update
  > Veewee: /home/aaron/.rvm/gems/ruby-1.9.3-p194/gems/veewee-0.2.3
  > Downloading: https://github.com/jedi4ever/veewee/tarball/master
  > Extracting: CentOS-4.8-i386 CentOS-5.5-i386-netboot CentOS-5.5-x86_64-netboot 
  > ...
  
  $ vagrant basebox define precise-amd64 ubuntu-12.04-server-amd64
  > ...
  > (profit)
  
Text

I'm just going to jot down my experience with my new MacBook Air. It will be rather more a stream of consciousness than structured prose, so my apologies in advance for that. I'll clean it all up later.

Anyway, quite happy with it so far, but like any geek with new kit, I want to know everything about it, and make it dance to my whim.

Two things learned this morning. The first: if you have an old MagSafe (1) power pack, as I did from my old 13" MacBook Pro, and it has an equal or higher wattage rating, you can use it with your MacBook Air.

It makes sense: a higher-rated power supply can support a lower-rated device, but not vice versa. So a 60 watt MacBook Pro MagSafe can power a 45 watt MacBook Air, but a 60 watt MagSafe can't power an 85 watt 15" MacBook Pro.

The other thing I have discovered is that the Thunderbolt port on the newer Airs uses the same socket form factor as the Mini DisplayPort. This means that if you have an old Mini DisplayPort -> DVI adapter lying around, you can reuse it.

Text

So, back from the Linux jungle and sitting in front of a MacBook once again.

My first real job has been to get a decent Unix environment up. OS X's BSD utilities don't really cut it. MacPorts is far and away the best distribution out there.

Once you install coreutils and get GNU ls, find etc., it makes sense that you will want to change your shell to a modern version of bash (or zsh if that's the way you roll).

Lion ships with bash 3.2, whereas MacPorts will give you a contemporary version 4.2. Unfortunately it's not as simple as going

$ sudo port install bash
  $ chsh 
    <input /opt/local/bin/bash>
  

I had to do this before, but I'd forgotten there's a trick to changing your shell in OS X to a non-standard location. The file /etc/shells contains the list of valid shells chsh will permit. You need to edit this file (as root or via sudo) and add your MacPorts shells. Once that's done, chsh will let you change shell no problem.
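
In practice that boils down to something like this (assuming your MacPorts bash lives at the default /opt/local/bin/bash):

$ sudo sh -c 'echo /opt/local/bin/bash >> /etc/shells'
  $ chsh -s /opt/local/bin/bash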

Tags: mac osx shell unix
Text

A quick bit of shell-fu.

To take a column from a MySQL database and quickly output it formatted as a Javascript array literal (without any specific escaping), do:

echo 'SELECT column FROM table WHERE some_column = "somevalue"' | mysql -uuser -ppass --silent yourdb | awk -v q="'" '{ print q $0 q }' | paste -s -d ',' | sed 's/\(.*\)/[\1];/'

The first part of the command is self-explanatory: you pipe a query into mysql and ask it (--silent) to give you raw unadorned output. It will return each row of column 'column' from table 'table' as a line of output.

You pipe that to awk and ask it to wrap each value in single quotes. Because the awk program itself is enclosed in single quotes on the shell command line, you pass the quote character in via the q variable rather than trying to escape it. paste then joins all the output lines together, separated by commas.

Finally I use sed to wrap the resulting output in the Javascript array literal '[' and ']' symbols and append a semicolon. Awk, or any other tool that can concatenate strings, would do just fine here too.
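
For example, given a hypothetical colour column containing the values red, green and blue, the pipeline would emit:

['red','green','blue'];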

Text

In Solr, to query for a date range, you use the syntax:

field_name:[Start TO Finish]
  

You can also use wildcards and specific constants in a logical way e.g:

[NOW TO *] or [* TO *]
  

To search over documents that do not have a value for that date field (i.e. it is NULL), you use the syntax:

-field_name:[* TO *]
  

It is hard, though, to search for dates that are EITHER NULL OR lie within a specific range.

It would seem logical to specify

date_field:[Start TO Finish] OR -date_field:[* TO *]
  

Unfortunately Solr does not appear to support specifying a field multiple times in this way.

So the trick is to effectively query for everything you do not want, then negate the result.

This approach says: select anything that isn't in the date range and is not null. When you invert that result, you get all the documents that sit inside the date range or are NULL.

The query to weave this magic is

-(-date_field:[Start TO Finish] AND date_field:[* TO *])
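
As a concrete (hypothetical) example, to find documents whose expiry_date is either NULL or falls within the next week, using Solr's date maths:

-(-expiry_date:[NOW TO NOW+7DAYS] AND expiry_date:[* TO *])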
  
Tags: solr search
Text

Magento makes use of design patterns, or at least an interpretation of design patterns. One particularly pernicious one is Mage::getSingleton().

A Singleton, if you've not heard the term before, was popularised in the Design Patterns book by the Gang of Four (Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides). To be very succinct, a Singleton is a way to ensure there is only ever one instance of a class in an Object Oriented design. To put it in even simpler terms, it is an Object Oriented version of a global variable.

It's used heavily in Magento (in the app/code/core directory, 2261 times in fact!). But anyway, why is it considered harmful? There are a number of arguments for why and why not. Herb Sutter's Once is not enough gives a pretty good (and fun to read) overview of them, or you can read Kenton Varda who looks in-depth at the topic. I generally think though, that in Object Oriented software, you're seeking to create abstractions around complexity. The Singleton is a (too) convenient escape hatch from encapsulation and can lead to the attendant issues you get with global variables.

In Magento/PHP land, a more implementation-specific problem with Singletons is memory consumption. Today I was revisiting a Magento Promotions extension I had written, trying to figure out why it was suddenly obliterating PHP's memory_limit.

This extension basically piggybacks on the existing Promotions/Coupons system, but generates an index of products that match coupon codes, the price before and after the promotion is applied and some other metadata.

In order to determine what products have a promotion associated with them, I run through all the products and match SalesRule conditions against them. I create a synthetic quote for the products that match and then pump them through the SalesRule validator. This effectively applies the promotion to the product and lets us see what the savings are.

It's fairly basic (it doesn't look at multi-product combinations), but it works well enough for simple cases.

/**
   * Apply pricing rules to a synthetic quote to calculate discounted price
   * 
   * @param string $couponCode
   * @param Mage_Catalog_Model_Product $product
   * @return  float
   */
  public function applyToProduct($couponCode, $product)
  {
      $quote = Mage::getModel('sales/quote');
      $item = Mage::getModel('sales/quote_item')
          ->setQuote($quote)
          ->setProduct($product)
          ->setQty(1)
          ->setBaseDiscountCalculationPrice($product->getPrice())
          ->setDiscountCalculationPrice($product->getPrice());
  
      $validator = Mage::getSingleton('salesrule/validator')
          ->init(1, 1, $couponCode);
  
      $validator->process($item);
  
      return $product->getPrice() - $item->getDiscountAmount();
  }
  

Now, when I wrote this code, it seemed sensible to use the validator as a Singleton; after all, I only needed one copy of it. It didn't, at the time, seem to make sense to create and then destroy the validator a couple of thousand times during indexing. Indeed, when this code was first deployed, everything ran smoothly.

Recently the user of this extension added a whole bunch of sales rules - and this caused that product/salesrule index loop to detonate.

That Singleton validator, which was written as some sort of optimization, started happily hosing over a gig of RAM.

Changing getSingleton() to getModel() took RAM usage down from 1100MB to about 80MB.
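
The change itself was a one-liner in the method shown above:

// Before: the same validator instance is shared across every call,
  // so it accumulates state for the life of the process.
  $validator = Mage::getSingleton('salesrule/validator')->init(1, 1, $couponCode);
  
  // After: a fresh validator per call, which PHP can garbage collect
  // as soon as it falls out of scope.
  $validator = Mage::getModel('salesrule/validator')->init(1, 1, $couponCode);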

My suspicion is that PHP's garbage collection wasn't cleaning up adequately after each validation attempt. As the validator is effectively static, it never gives up its references for PHP to clean up. When you use getModel(), the validator loses all its references after each loop iteration. That means it also has to be constructed anew each time around, but it allows PHP to free the memory it was using.

The Singleton is already a controversial pattern these days, but Magento developers should be particularly wary of its implementation and its scope to hose memory.

Text

Out of the box, sadly, PHPStorm doesn't make nice with the Ubuntu Unity launcher.

Typically I manage PHPStorm by extracting it to /opt and then symlinking the extracted folder to /opt/PhpStorm.

To create a nice launcher for Unity, you create a desktop entry under ~/.local/share/applications:

$ vim ~/.local/share/applications/jetbrains-phpstorm.desktop

Now paste in the following (adjusted for your own paths):

[Desktop Entry]
  Version=4.0.1
  Type=Application
  Name=JetBrains PhpStorm
  Exec=/opt/PhpStorm/bin/phpstorm.sh %f
  Icon=/opt/PhpStorm/bin/webide.png
  Comment=Develop with pleasure!
  Categories=Development;IDE;
  Terminal=false
  StartupNotify=true
  StartupWMClass=jetbrains-phpstorm
  

Hit your Windows or Super key and type in 'Php' - you should see the newly created desktop entry there. Once launched and fully started, you can opt to keep PHPStorm locked to the Unity launcher for easy startup.
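
If the entry doesn't show up, it's worth checking the file for mistakes. Assuming you have the desktop-file-utils package installed, its validator will point out any syntax errors:

$ desktop-file-validate ~/.local/share/applications/jetbrains-phpstorm.desktop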

Text

For about eight years I ran Gentoo Linux before I eventually gave it up and moved on to Ubuntu. It was remarkable in that it provided a BSD-like ports system and let you compile your system from the ground up. It also tended to break, a lot.

Even today, almost all x86 Linux distributions can (in theory) run on a 32-bit 386 processor. Let's be clear though: while it is a remarkable (and successful) processor, the 386 is an antique. The selling point of Gentoo was that it was one of the few distributions to give you the power to build a system specifically for modern processors, abandoning backwards compatibility with long-obsolete ones.

In the early days you had to start from what was called a 'stage 1' install (and I don't think they do it this way anymore). A stage 1 install is where all you have is a bootable livecd with a basic set of tools: a C compiler, a shell and the basic GNU coreutils... and irssi. Enough utilities to allow you to build further C packages. You'd run the command 'emerge system' and it would go off and build gcc, glibc, coreutils etc. Once these were built, you'd then rebuild glibc and gcc with your newly compiled, architecture-specific compiler and C library.

Anyway, this took a long time, particularly on sub-gigahertz Pentium 2s and 3s, and Gentoo systems tended to break a lot. And by break, I mean that in the absolute best case the machine merely became un-bootable.

The process to recover the system was pretty much the same as to install it: you had to boot from a livecd, configure your network card, hook up to the network, then chroot into the broken disk. At this point you could try to repair whatever damage you had caused.

These days if you do something silly, like, I don't know, try to dist-upgrade from Ubuntu Oneiric to Precise, you can get that true Gentoo feeling (i.e. nothing works and you can't boot the machine).

This happened to me this afternoon and the hard yards done with Gentoo came to the rescue.

Here's how you do it.

Boot up from a livecd (or USB key), get the network card modules loaded and get a DHCP address. With the Precise Live DVD you can do all of this pretty easily by selecting the 'try without installing' option from the bootloader.

Once you're online, you need to prepare the mount. The first step is to mount the root partition somewhere under /mnt, say /mnt/ubuntu (it can be whatever). If you're not sure of your partition numbers, your livecd will almost certainly come with fdisk, in which you can push 'p' to print the partition table.

$ mount -t ext4 /dev/sda5 /mnt/ubuntu
  

If you have a separate boot partition, mount that too.

$ mount -t ext2 /dev/sda1 /mnt/ubuntu/boot
  

Now, in order to have a functional chroot, we need the proc, dev and sys filesystems to be mounted inside the chroot. This is the tricky bit.

$ mount -t proc none /mnt/ubuntu/proc
  $ mount -o bind /dev /mnt/ubuntu/dev
  $ mount -o bind /sys /mnt/ubuntu/sys
  

In the case of the sys and dev directories, we need to reference the exact same mountpoints as the host, so we use the -o bind option.

Last thing: we want functional network name resolution, so copy the host's /etc/resolv.conf over to /mnt/ubuntu/etc/resolv.conf:
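
$ cp /etc/resolv.conf /mnt/ubuntu/etc/resolv.conf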

Now the chroot is ready:

$ chroot /mnt/ubuntu /bin/bash
  $ source /etc/profile
  

The chroot will be pretty much as it would be if you'd booted into it normally, with a few exceptions. The kernel and kernel modules will be those of the host. If you need access to some specific hardware, you need to set that up on the host.

My busted Precise install was fixed with a simple apt-get update and upgrade, before re-running the grub installer.
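
For the record, from inside the chroot that looked something like this (assuming grub belongs on /dev/sda; adjust for your own disk):

$ apt-get update && apt-get dist-upgrade
  $ grub-install /dev/sda
  $ update-grub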

Text

I came across a particularly nasty bug in Magento 1.6.2.0 last night where calling Mage::getSingleton('cataloginventory/stock_status')->rebuild() would set all grouped products to be out of stock. This didn't happen in 1.5; however, the cataloginventory status handling changed dramatically between 1.5 and 1.6.

Forcing the cataloginventory_stock indexer to re-run fixes the situation, but if you want to script the status update of many stock items, you can have a short period where your store's products will be unavailable.

Stepping through the issue, I found myself in app/code/core/Mage/Catalog/Model/Resource/Product/Status.php, specifically in the getProductStatus() method.

/**
   * Retrieve Product(s) status for store
   * Return array where key is a product_id, value - status
   *
   * @param array|int $productIds
   * @param int $storeId
   * @return array
   */
  public function getProductStatus($productIds, $storeId = null)
  {
     $statuses = array();
  
     $attribute      = $this->_getProductAttribute('status');
     $attributeTable = $attribute->getBackend()->getTable();
     $adapter        = $this->_getReadAdapter();
  
     if (!is_array($productIds)) {
         $productIds = array($productIds);
     }
  
     if ($storeId === null || $storeId == Mage_Catalog_Model_Abstract::DEFAULT_STORE_ID) {
         $select = $adapter->select()
             ->from($attributeTable, array('entity_id', 'value'))
             ->where('entity_id IN (?)', $productIds)
             ->where('attribute_id = ?', $attribute->getAttributeId())
             ->where('store_id = ?', Mage_Catalog_Model_Abstract::DEFAULT_STORE_ID);
  
         $rows = $adapter->fetchPairs($select);
     } else {
         $valueCheckSql = $adapter->getCheckSql('t2.value_id > 0', 't2.value', 't1.value');
  
         $select = $adapter->select()
             ->from(
                 array('t1' => $attributeTable),
                 array('value' => $valueCheckSql))
             ->joinLeft(
                 array('t2' => $attributeTable),
                 't1.entity_id = t2.entity_id AND t1.attribute_id = t2.attribute_id AND t2.store_id = ' . (int)$storeId,
                 array('t1.entity_id')
             )
             ->where('t1.store_id = ?', Mage_Core_Model_App::ADMIN_STORE_ID)
             ->where('t1.attribute_id = ?', $attribute->getAttributeId())
             ->where('t1.entity_id IN(?)', $productIds);
         $rows = $adapter->fetchPairs($select);
     }
  
     foreach ($productIds as $productId) {
         if (isset($rows[$productId])) {
             $statuses[$productId] = $rows[$productId];
         } else {
             $statuses[$productId] = -1;
         }
     }
  
     return $statuses;
  }
  

This method goes through a list of product ids and assigns a status id to each; it is typically used on grouped products when determining if all of their children's stock items are out of stock.

In testing, the status ids were all coming back as -1, i.e. not valid, and therefore the group was deemed out of stock.

In my code the store id was neither null nor the default store id, so execution fell through to the else branch. At first I inserted a print_r($select->assemble()) to see the SQL being generated. The SQL was fine, and when pasting it into MySQL I got a bunch of valid-looking results. Funnily though, the status column came first and the product id column second (unlike the if branch, where they are in the reverse order). This presents a problem when we reach the fetchPairs() call.

Zend DB's fetchPairs() returns an associative array result set where column a is the key and column b is the value. Because the SQL was returning the status column first (i.e. as the key), the result set collapsed to just two rows, one for each unique status code. For this code to work as you would expect, the entity id (product id) needs to come first in the result set, so that it gets used as the key.
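
To illustrate, with made-up product ids (42, 43, 45) and Magento's status values (1 = enabled, 2 = disabled):

// fetchPairs() keys each row on its first column.
  // With status first (three products, statuses 1, 1 and 2):
  //   array(1 => 43, 2 => 45)           -- collapses to one entry per status!
  // With entity_id first:
  //   array(42 => 1, 43 => 1, 45 => 2)  -- one entry per product, as intended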

The fix is straightforward enough: replace

$select = $adapter->select()
              ->from(
                  array('t1' => $attributeTable),
                  array('value' => $valueCheckSql))
  

with

$select = $adapter->select()
              ->from(
                  array('t1' => $attributeTable),
                  array('entity_id', 'value' => $valueCheckSql))
  

This way the product id is always used as the key in fetchPairs() and you get a status result for each product.

Tags: magento bugs
Text

"The big idea is messaging" [Kay98]. It's a quote cited early on in Growing Object-Oriented Software, Guided by Tests (GOOS), a book that looks at Test Driven Development (TDD) using Mock Objects. This idea, that messaging is central to Object Oriented Analysis and Design (OOAD), drives much of what is presented throughout the book.

I think as OOAD has matured over the past decade, the mode of thinking of classes as hierarchical constructs has lost favour. Increasingly, OOAD is about managing the collaboration of a large number of small, independent objects. In such designs the solution to the problem is achieved by the way the developer defines the software's object graph – i.e. its composition. In a design that is focussed on getting the composition of objects just so, communication between them is the important thing, much more so than classification.

In GOOS, TDD is presented as an exercise to first understand and then improve messaging protocols between objects. GOOS demonstrates that Mock Objects are an ideal tool to help discover these protocols. This is unusual, even in 2012, for a lot of developers. Typically I've always used Mock Objects as a Test Stub or Double: a placeholder object to induce some specific behaviour I want to test, or to isolate the unit of code I'm testing. GOOS sees Mocks a different way, as a means of representing roles within a system. The authors say a Mock Object is not a Stub, but instead an interface to some behaviour or role. Mock Roles, not Objects, a paper written by Freeman, Pryce, et al. way back in 2004, explains this concept and sets out much of the groundwork for GOOS.

GOOS itself is structured into five sections. The first couple of sections (very) briefly introduce the reader to TDD, the basic tenets of Agile development (very heavily influenced by Extreme Programming, XP), testing tools (JUnit, JMock, Hamcrest, Windowlicker) and the authors' OOAD philosophy.

The overwhelming impression I get from Freeman and Pryce's introduction to TDD and OOAD is that they see writing tests as less an exercise in producing a regression-catching suite, and more as a design exercise. By writing a failing test up front, you have an immediate, testable statement of intent as to what the software will do, and just as importantly, what it won't do. Focussing on just a narrow slice of behaviour, as represented by a single test, helps narrow the scope of what needs to be done. The act of satisfying the test, making it pass, focuses the developer's mind on the domain of the problem and forces both the developer and the customer to think about what from the environment impacts the test. Flushing out dependencies, object peers and services early on is a good thing.

Too often in Big Design Up Front approaches, you write a ton of code according to a pre-determined (but unchecked) idea of the environment. When you go to hook everything up at the end, you can find, to the horror of your project sponsor (and spouse, who won't see you for at least the next week), that actually nothing hooks up. Or worse, that you have misinterpreted what the sponsor wanted. In a Test Driven Design, if elements of the system are incompatible, you're alerted to it early. If you have gotten the requirements completely wrong, the customer can see it straight away. TDD takes 'Fail Fast' to its logical extreme.

The meat of the book is a worked example. The authors describe a fictional auction sniper system that connects to an auction, makes bids on items and either wins or loses. It's a simple example, but complicated enough to run into common issues developers face when developing OO software. Certainly there is little trouble filling out 150 pages as the authors work from an abstract set of stories to concrete code. What is good about their example is the way you see the code transforming in stages as extra features are added. The writers are careful to explain the motivations for the transformations they make and they tie it back to the TDD and OOAD principles they introduce in the first two sections.

Having worked through the Auction Sniper application using Java, JUnit, JMock, Hamcrest and Windowlicker, there's a brief recap and the book moves swiftly on to its fourth and fifth sections.

The fourth section covers 'sustainable TDD', an important and increasingly relevant topic for many developers. The burden of maintaining poorly designed test suites is a drag on developer productivity. Rather than liberating developers to improve the structure of their code, bloated, indecipherable test suites become a handbrake. GOOS goes through techniques to keep test suites effective, flexible and, importantly, expressive. The concept that software is about communication is emphasised across the book. Tests are no different. Tests should express the developer's intent and function as the rough-and-ready documentation of a unit of code. I found the advice around constructing data builders (techniques for creating test data for use in your test cases) particularly valuable.

The final section tackles the really hard stuff, dealing with persistence (and by extension any sort of frameworky data service), asynchrony and lastly concurrency.

I found GOOS easy to read and its chapters are of a length that can be easily read on the train/bus or in short bursts. Physically, the paper is of high quality and the typesetting clear and easy to read. I really like the images and diagrams, which are simple, authentic and aid what is being discussed. I felt like the introductory sections to TDD, and particularly the authors' OO philosophies, were fairly succinct, perhaps too succinct. But as they state from the outset, this is not a book on TDD. And anyway, [Meszaros07] should be enough TDD for any human being.

GOOS comes with a particularly outstanding bibliography. Freeman and Pryce's academic backgrounds and broad reading are on display in the depth and quality of their references. So GOOS, while being a pretty domain-specific (Mock Object) text, serves as a wonderful launching pad to further OOAD and TDD reading.

I like that the authors are people who practise what they preach. GOOS is a book for people who write real code in the real world. Sadly there seem to be too many authors, 'consultants' and coaches these days who talk a lot about programming and programming techniques yet seldom practise them in the wild. GOOS reads like a book borne out of brutal trench warfare with Objects. It is refreshing to read a book detailing a principled design philosophy and practice that has been tested in the dirty unwashed world of Enterprise.

I've slowly been pruning back my physical book collection and trying to maintain a library of what I would consider 'classics'. The GoF book, Fowler's Refactoring and PoEAA books, Beck's XP White Book, the Prag Prog book, Kernighan and Ritchie's C book, SICP and Knuth's The Art of Computer Programming series (and, if I can ever get it, Mike Abrash's Black Book). I think, for me, this book fits into that category. It represents a decade of work and thinking, neatly explained by two highly skilled and above all practical developers who were at the heart of it all.

I highly recommend this book to anyone actively practising TDD and also generally to anyone with an interest in Object Oriented Software Design and Practice. While GOOS is a 'Java book' and day to day I program in PHP, the principles, practices and overarching philosophy easily translate.

Follow Steve Freeman @sf105 on Twitter, and read his blog http://www.higherorderlogic.com/. Nat Pryce is @natpryce and blogs at http://www.natpryce.com/.

Also be sure to check out the excellent mailing list http://groups.google.com/group/growing-object-oriented-software.

Text

Twig is a PHP implementation of Jinja2, a Python templating engine. Unfortunately there's no specific syntax highlighting support for .twig files in Vim. But that's no real problem, as you can use the htmljinja syntax file provided here: http://www.vim.org/scripts/script.php?script_id=1856

To map it to .twig files, edit your vimrc and add:

au BufRead,BufNewFile *.twig set filetype=htmljinja
  
Tags: php twig vim jinja
Text

It can be confusing which file to put certain shell / environment setup information in.

Generally speaking (i.e. not with Mac OS X's Terminal.app), .bash_profile gets sourced only on login. Specifically, this means only when you enter your username and password at the console. The .bashrc file is sourced when starting an interactive session, that is, whenever you open up a terminal.

There is some confusion here when you open up a login shell, such as when you use the su - command, or run an explicit login shell as sometimes provided by a desktop environment. In these cases the same rule applies: a login shell sources .bash_profile, and .bashrc only gets a look-in if your .bash_profile explicitly sources it.

I tend to put environment setup in .bash_profile: things like paths and any one-time configuration settings that aren't likely to change very much. But it's quite reasonable to just put a source .bashrc in your .bash_profile and then put everything in .bashrc.
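
That is, a minimal ~/.bash_profile along those lines would be:

# ~/.bash_profile
  [ -f ~/.bashrc ] && source ~/.bashrc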

Tags: bash
Text

Set up a static block in the admin CMS screens, giving your block an identifier. You then use this identifier to declaratively load the block in your template.

Then, to include it in a template (say homepage.phtml):

<?php echo $this->getLayout()->createBlock('cms/block')->setBlockId('identifier')->toHtml() ?>
  
Text

In PHPUnit it's quite possible to completely mock a class that employs a fluent interface without too much heavy lifting.

$mock = $this->getMock('Zend_Mail');
  $mock->expects($this->any())
       ->method(new PHPUnit_Framework_Constraint_IsAnything())
       ->will($this->returnSelf());
  

This has the effect of making any method call on your mock object return a reference to the mock itself.
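
So, for example, chained calls on the mock hold together just as they would on a real Zend_Mail instance (setSubject(), setBodyText() and addTo() are genuine Zend_Mail methods):

$mock->setSubject('Hello')
      ->setBodyText('Testing the fluent mock')
      ->addTo('someone@example.com');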