/dev/notes

Nov 07

Git --name-only Considered Helpful

A number of git commands take the --name-only argument, which can help give you an overview of what is going on between two branches, or in a specific commit.

$ git show --name-only <commit>

This will give you the commit header followed by a list of the files affected by that commit.

Alternatively, if you don’t care about the specific content differences between two branches and only want to see which files differ, you can do

$ git diff master..origin/master --name-only

This will show you the list of files that are different between your local master branch and the remote master branch. Handy if you have just done a git fetch and want to see what’s different before merging or rebasing.
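If you want to try both commands without touching a real project, here is a minimal sketch against a throwaway repository. The file names, the feature branch and the demo identity are all made up for illustration; only the two --name-only invocations come from the notes above.

```shell
#!/bin/sh
# Throwaway repository to try --name-only in (every name here is hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name whatever git's default is

# Identity is passed per command so the sketch does not rely on global config.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo one > a.txt
g add a.txt
g commit -q -m "Add a.txt"

git checkout -q -b feature
echo changed >> a.txt
echo two > b.txt
g add a.txt b.txt
g commit -q -m "Touch a.txt, add b.txt"

# Files touched by the last commit; --format= hides the commit header.
shown=$(git show --name-only --format= HEAD)

# Files that differ between the two branches.
diffed=$(git diff --name-only master..feature)

printf '%s\n' "$diffed"
```

Swap master..feature for master..origin/master and you have the post-fetch check described above.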

Oct 28

Reset MySQL’s Root Password

If for some reason you have forgotten the root password for an existing MySQL installation, you can recover the account by starting mysqld with the --skip-grant-tables option. This is roughly analogous to starting a Unix system in single user mode.

First, shut down the running instance and then restart it directly:

$ sudo -u <mysql_user> mysqld_safe --skip-grant-tables --skip-networking

The --skip-networking option is important: with the grant tables skipped, any user can connect to the running mysqld service with full permissions.

Once you’ve started the server up, log in without a password, and issue an update query to the mysql.user table.

$ mysql -uroot mysql
mysql> UPDATE user SET password=password('newpassword') WHERE User = 'root';
mysql> FLUSH PRIVILEGES;

Close down mysqld and restart. You’re good to go.

Oct 15

Rails Should be more Worried about Becoming the OLD PHP

TL;DR Basically it is all Google’s fault.

We’ve seen some pretty epic PHP rants this year. Probably the most famous among them are ‘PHP: a Fractal of Bad Design’ and Jeff Atwood’s latest (in what seems to be a biennial broadside), ‘The PHP Singularity’.

The common thread in these rants is incredulity that anyone would, in 2012, write new code in PHP. There are plenty of reasons why someone might write greenfield PHP code in 2012. But equally (and this is said as a decade-long PHP programmer) I have to admit that plenty of the criticisms levelled at PHP are valid. Yet, for the most part, they just don’t particularly matter.

One criticism that is wholly invalid, yet comes up time and time again, is that using PHP intrinsically leads to bad code. I don’t feel this is an inherent trait of PHP itself so much as a symptom of its popularity and low barrier to entry. Basically there are more examples of bad code out there than for pretty much any other platform because there’s simply more code out there, written by programmers of wildly varying skill. The other problem is that PHP came into existence as a scratch to a C programmer’s itch. PHP was developed at a time when people still actually wrote web applications in C and when the stateless nature of HTTP was relatively respected. PHP, like the web itself, has moved on dramatically since then.

A modern PHP 5.4 webapp looks about as similar to an early-00s PHP 4 webapp as Scala does to Java. Yet many critics, when slating the language, appear to be code archaeologists, excavating prehistoric practices that went out of favour long ago.

This can be partially forgiven, because owing to the age of the language there’s plenty of out of date information out there with high rankings in Google. The ubiquitous w3schools is an unfortunate example of bad practices coming well ahead of sites with more modern approaches to solving problems in PHP.

So ‘New PHP’ is very different to ‘Old PHP’. But the ‘Old PHP’ is what most people seem to find when searching in Google and this confuses people.

We see this manifested in blogs like ‘Will Rails become the new PHP’. This blog has a number of spectacular shortcomings, most egregiously the author’s horribly naive view of the PHP community, but the interesting one is his ignorance of the power of Google.

There’s plenty of support out there for budding PHP programmers, whether on the web, on forums, or on IRC. There are countless well attended, well supported and growing conferences, meetups and the like for PHP programmers. One thing a community cannot do is force Google to nuke w3schools’ PageRank, which means that brilliant efforts like PHP The Right Way get swamped by old, incorrect and at times dangerous dreck.

And what the ‘Will Rails become the new PHP’ author perhaps hasn’t realised is that Rails, at least in the terms he’s trying to couch it, has already become the ‘New PHP’. I am a novice Rails developer; I like to hack around in it as it can be quite fun to spike out solutions. What I’ve been struck by is the sheer amount of bad advice out there, advice novices will come across if they turn to Google for help.

If you’re well versed in a platform you learn through brutal experience what works, what doesn’t and your nose is finely tuned to bullshit. When I read a PHP article I know instinctively if what I am reading is reliable. But with Rails, as a novice, I don’t quite have that sense, beyond my own background experience with other programming languages.

So let’s look at an example. I’ve been working on a dead simple Rails authentication webservice. It listens for HTTP requests for /login, /logout, /session, etc., and emits either XML or JSON in response. I’m using the respond_to method to serve out these responses. Unfortunately what I found is if I request a route that does not exist, I get an HTML error back. This doesn’t make a lot of sense for a webservice that otherwise speaks XML and JSON.

Other global exceptions similarly respond with HTML. I don’t want to wrap every action up in a begin/rescue block and there is certainly no way to intercept router exceptions in actions anyway. So I needed to learn how to catch global exceptions.

In my journey of (Google) discovery I came across this blog post and appeared to hit paydirt. The advice appears to be legit, hell someone in the rails community even featured it in a podcast. So on the face of it this seems good. But that switch statement sure is smelly. Does it, really, need to be this hard?

Now my general purpose programming brain recognised that this code, while solving my problem, is not ideal. And why is it not ideal? This one method sure has a lot of responsibility. Method names with plurals in them are usually a code smell. Over time, as more specific exceptions need to be handled, I would end up with code that is as easy to read as Goethe’s Faust, photocopied and in the original Gothic script (not easy). What we have here is a God Method in training.

Now, what if I am a Ruby/Rails/Programming novice? I have got plenty of other stuff to learn, I’m going to go right here and Cargo Cult this code into my webapp and move on. Just like all the rookie PHP coders do, right?

Well, I didn’t do that. I saw that this rescue_from method was pretty awesome and so I went to the Rails API docs to look it up.

API docs for any language are pretty terse, but what jumped out at me was this line:

“Handlers are inherited. They are searched from right to left, from bottom to top, and up the hierarchy. The handler of the first class for which exception.is_a?(klass) holds true is the one invoked, if any.”

This isn’t great documentation admittedly, but it basically means that if you put rescue_from Exception at the bottom of the list of rescue_from handlers in your application_controller.rb file, then since everything derives from Exception, nothing else will get a look in (Rails looks at the handlers from the bottom up). Declare the catch-all first and the more specific handlers below it, and each exception finds its own handler. The author of that helpful blog we found didn’t realise this, and so his solution was needlessly complicated.

What can we learn from this? Well Rails programmers certainly live in a glass house and shouldn’t throw stones is one thing. But on a slightly less trollish note, there is a problem here for all novice programmers that turn to Google to help them solve problems. The answers on Google are usually either wrong, or at best, incomplete. As the web gets older, bad and out of date advice piles up making it much harder for novices to find good advice.

Knocking a language for this phenomenon (or a framework, seriously, whatever) is more than a little ignorant and doesn’t solve the problem. Efforts like PHP The Right Way are how PHP is trying to fix it. If Rails really doesn’t want to be the ‘Old PHP’, its community needs to realise it’s less to do with languages and platforms, and more about SEO.

Oct 11

Update Git Remote Branches List

Over time, a remote will have branches added and deleted. Your local working snapshot can often get littered with stale, now removed branches.

To see what branches your local repo thinks exist, you can do something like this:

$ git branch -rv
> origin/1620-upgrade  2e0cc56 Ignore active local.xml from vc
> origin/HEAD          -> origin/master
> origin/cas-sso       2351be5 Add gateway logiin and logout support
> origin/giveaways   63daf5a Use cms blocks for banner placements
> origin/master        496c975 Merge affiliate module
> origin/newskin      d7220c9 Optimise skin and ui images
> origin/release       496c975 Merge affiliate module

So this is my local Magento git repository. Many of the branches here are now defunct and no longer in the remote (i.e. I had previously used $ git push origin :branch from another host).

To refresh then I need to prune my branches list. The git incantation to do this is

$ git remote prune origin
> Pruning origin
> URL: dev@vcs:git/store.git
* [pruned] origin/1620-upgrade
* [pruned] origin/giveaways
* [pruned] origin/newskin

Looking at the remote branch list again:

$ git branch -rv
> origin/HEAD    -> origin/master
> origin/cas-sso 2351be5 Add gateway logiin and logout support
> origin/master  496c975 Merge affiliate module
> origin/release 496c975 Merge affiliate module
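To watch the whole cycle happen somewhere harmless, here is a minimal sketch that fakes the two hosts with a bare repository and two clones in a temp directory. The directory names, the stale branch and the demo identity are invented for the example; --dry-run is the prune option that previews without deleting anything.

```shell
#!/bin/sh
# Simulate "branch deleted from another host" with scratch repositories.
# host_a, host_b and the 'stale' branch are hypothetical names.
work=$(mktemp -d)
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

git clone -q "$work/remote.git" "$work/host_a" 2>/dev/null
cd "$work/host_a"
git symbolic-ref HEAD refs/heads/master
echo hello > README
g add README
g commit -q -m "Initial commit"
git push -q origin master
git push -q origin master:stale          # publish a second branch

# The second host clones while the branch still exists.
git clone -q "$work/remote.git" "$work/host_b"

# The first host deletes the branch on the remote.
git push -q origin :stale

# The second host still lists it until it prunes.
cd "$work/host_b"
before=$(git branch -r | grep -c 'origin/stale')
git remote prune --dry-run origin        # preview only
git remote prune origin
after=$(git branch -r | grep -c 'origin/stale')
echo "stale remote refs before: $before, after: $after"
```

If you would rather not remember a separate command, git fetch --prune does the same clean-up as part of a fetch.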

Adding New Magento Cache Types -

magento-quickies:

A simple configuration recipe for adding new cache tags to the Magento backend’s “clear cache” feature.

Sep 03

Disabling Magento’s DB Logs

If you’ve ever been responsible for a busy Magento store, you will inevitably run into issues with the various log_* tables getting too big and caning your database.

In theory the Magento cron subsystem should keep a lid on these tables growing too big, but I avoid using Magento cron, preferring to handle that myself directly via crontab tasks.

The other option is to write your own table cleaning script (or copy one from somewhere), and this will work too. But it’s annoying: if you don’t want this log data, why write it in the first place?

So my solution is to disable it by removing the observer events that perform the logging.

I have this in my local.xml which takes precedence over other nodes in the config and therefore overwrites them. Here, by setting the observer to be the string ‘disabled’, the existing observer event is removed and replaced with something that will never be fired.

Now, you don’t need to worry about periodically cleaning out your database, nor do you need to fear a 3am text message from your production DB servers screaming about the disk being full…

Aug 30

Magento CatalogSearch does not escape Breadcrumbs

Ahh a little WTF to start the morning.

I’m going through some PCI scan results this morning, and in the main it’s going well, but I got a couple XSS hits on our catalogsearch pages. This is odd, I think. I’ve audited these pages, they definitely get routed through magento’s escaping code.

On closer examination it turned out the form was okay; it was via the breadcrumbs that unescaped input was getting into the wild.

I’m running Mage 1.6.x so this code may look a little different if you’re running 1.7

Take a look at app/code/core/Mage/CatalogSearch/Block/Result.php, and specifically at the prepareLayout() method:

Now if you look at line 11, if breadcrumbs are enabled, unescaped input is happily added ready for output.

$title = $this->__("Search results for: '%s'", $this->helper('catalogsearch')->getQueryText());

The fix is easy: replace that line with:

$title = $this->__("Search results for: '%s'", $this->helper('catalogsearch')->getEscapedQueryText());

This is a really neat example of the evils of duplication and where bad programming practice can lead to real world problems. I am speculating, but it seems reasonable to infer that the original programmer got trigger happy with the copy & paste keys. Later, you could imagine another engineer coming in to make the code XSS-safe, fixing one occurrence but (programmers are human) missing the other, identical line, and we end up with an issue like this.

Personally, I patched the file as described above and stuck it in app/code/local/Mage to override the core code pool version.

Aug 17

Build a Chef Gem From Source

I get really frustrated with Ruby packages, they promise so much and when on that special day the moon is aligned with Mars, it all just works, and life is great.

Unfortunately this doesn’t happen very often and when using a stack of Rubygems, you almost always get bitten by something.

My cause for complaint today is Vagrant and Chef, well, specifically Chef Solo. Vagrant is fine, it does what you tell it to do, and for most use-cases Chef Solo is the right tool to use for provisioning your virtual server. Unfortunately the Vagrant docs on Chef Solo fib to you: they say you can use Data Bags with Chef Solo, but by default you cannot.

This is a big deal as many useful Chef recipes make heavy use of Data Bags. Data Bags, which let you provide environment-specific configuration for your provisioning, are not yet supported by Chef Solo in the stock Chef gem (currently version 10.12.0). In order to make use of Data Bags with Chef Solo, you need version 10.14.0 or above. This means building the gem from source.

I use Veewee to build my vagrant base boxes (you should too, it’s awesome!), and you can edit the postinstall.sh file in your box definition folder to build Chef from source, rather than installing it directly via Rubygems.

You can repeat this for your local dev machine, and now you can get Chef Solo cooking up your recipes and happily using data bags.

Aug 11

Remove a Magento Adminhtml Menu Option

If, for whatever reason, you need to remove an entry from the magento admin menu, you have two simple options. Remove it using css, or alternatively, drop the following into a custom module’s adminhtml.xml.

This overrides the core code pool’s adminhtml definition, and puts a dependency on a non-existent module. Effectively, this disables the menu item because it no longer meets the defined dependency requirements.

As always with any magento configuration / module changes, you may need to clear caches for this to take effect.

Aug 08

Find and Delete Files Between Two Dates

GNU Find never ceases to amaze me with its utility.

Yesterday I had to do an emergency restart of mysql in production and the resulting magento report/xxxx files swamped out everything else that I might have wanted to look at.

So specifically, I wanted to delete all the files that were created between a start and end date.

GNU find makes this easy

$ find . -type f -newer 655958024588 ! -newer 627812220533 -ls

This instructs find to list (-ls) all files (-type f) that are newer than a file called 655958024588 (-newer) and not newer than 627812220533 (! -newer).

If you do not have two files to act as date range boundaries, you can use touch to create them.

$ touch -t yyyymmddHHMM start_date_file
$ touch -t yyyymmddHHMM end_date_file

Then supply these file names to -newer and ! -newer.

To delete the files we can use -exec.

$ find . -type f -newer 655958024588 ! -newer 627812220533 -exec rm -rf {} \;

Here it’s the -exec argument that does the heavy lifting. {} is a placeholder for the file name (find substitutes ‘{}’ with each found filename) and \; terminates the command (the backslash stops the shell from swallowing the semicolon before find gets to see it).
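Before pointing this at anything in production, it is worth rehearsing in a scratch directory. The sketch below does that with made-up file names and timestamps. One thing it shows that is easy to miss: the end boundary file itself satisfies both tests (newer than the start file, not newer than itself), so if you created it with touch you probably want to exclude it, as done here with ! -name.

```shell
#!/bin/sh
# Rehearse the date-range delete in a scratch directory.
# All file names and timestamps below are hypothetical.
dir=$(mktemp -d)
cd "$dir"

# Boundary markers for the window 09:00 to 11:00.
touch -t 201208070900 start_date_file
touch -t 201208071100 end_date_file

# One file before the window, one inside it, one after it.
touch -t 201208070800 report_before
touch -t 201208071000 report_inside
touch -t 201208071200 report_after

# Dry run: see what would go, skipping the boundary markers themselves.
matched=$(find . -type f -newer start_date_file ! -newer end_date_file \
    ! -name '*_date_file')
printf 'would delete: %s\n' "$matched"

# The real thing. '+' batches the names into as few rm calls as possible.
find . -type f -newer start_date_file ! -newer end_date_file \
    ! -name '*_date_file' -exec rm -f {} +

ls
```

Only the file stamped inside the window is removed; the two boundary markers and the files either side of the window survive.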

Aug 07

veewee-templates-update - A Hidden Little Gem

Veewee considerably simplifies the process of creating base distribution images for use with Vagrant, but unfortunately you have to choose between using the easy to install gem (which comes with horribly out of date basebox templates), or install the latest version from source, which unfortunately uses rvm in a pretty repugnant way.

So, if you want to use veewee to set up a new amd64 Precise Pangolin basebox for vagrant, you either have to pull the latest veewee sources from github, or download the most recent release’s templates and copy them over into your veewee gem folder.

This is where veewee-templates-update steps in, it automates that latter step (downloading and installing just the updated templates) for you.

Installation is simple:

$ gem install veewee-templates-update

Then, just run the updater:

$ veewee-templates-update
> Veewee: /home/aaron/.rvm/gems/ruby-1.9.3-p194/gems/veewee-0.2.3
> Downloading: https://github.com/jedi4ever/veewee/tarball/master
> Extracting: CentOS-4.8-i386 CentOS-5.5-i386-netboot CentOS-5.5-x86_64-netboot 
> ...

$ vagrant basebox define precise-amd64 ubuntu-12.04-server-amd64
> ...
> (profit)

Jun 22

Macbook Air - Everything Old is New

I’m just going to jot down my experience with my new Macbook Air. It will be rather more a stream of consciousness than structured prose, so my apologies in advance for that. I’ll clean it all up later.

Anyway, quite happy with it so far, but like any geek with new kit, I want to know everything about it, and make it dance to my whim.

Two things learned this morning. First, if you have an old MagSafe (1) power pack, which I did from my old 13” MacBook Pro, and it has a higher or equal wattage rating, you can use it with your Macbook Air.

It makes sense: a higher rated power supply can support a lower rated device, but not vice versa. So a 60 watt Macbook Pro magsafe can power a 45 watt Macbook Air. But a 60 watt Magsafe can’t power a 15” 85 watt Macbook Pro.

The other thing I have discovered is that the thunderbolt port on the newer Airs uses the same socket form factor as the Mini DisplayPort. This means that if you have an old Mini DisplayPort -> DVI adapter lying around, you can reuse it.

Changing Your Shell in OSX

So, back from the Linux jungle and sitting in front of a Macbook once again.

My first real job has been to get a decent unix environment up. OSX’s BSD utilities don’t really cut it. Macports is far and away the best distribution out there.

Once you install coreutils and get ls, find etc it makes sense you will want to change your shell to a modern version of bash (or zsh if that’s the way you roll).

Lion ships with bash 3.2, whereas Macports will give you a contemporary version 4.2. Unfortunately it’s not as simple as going

$ sudo port install bash
$ chsh 
  <input /opt/local/bin/bash>

I had to do this before, but I’d forgotten there’s a trick to changing your shell in OSX to a non-standard location. The file /etc/shells contains a list of valid shells chsh will permit. You need to edit (as root or via sudo) this file and add your macports shells. Once that’s done chsh will let you change no problem.
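For the record, here is a minimal sketch of the edit as a small helper, so it is safe to run twice. register_shell is a name I made up, and it takes the file path as an argument so you can try it on a scratch copy before letting it near the real /etc/shells.

```shell
#!/bin/sh
# register_shell (hypothetical helper): append a shell to a shells file
# unless that exact line is already present.
register_shell() {
    shell_path=$1
    shells_file=${2:-/etc/shells}
    grep -qxF "$shell_path" "$shells_file" 2>/dev/null ||
        printf '%s\n' "$shell_path" >> "$shells_file"
}

# Try it on a scratch copy rather than the real file.
scratch=$(mktemp)
printf '%s\n' /bin/bash /bin/sh > "$scratch"
register_shell /opt/local/bin/bash "$scratch"
register_shell /opt/local/bin/bash "$scratch"   # second call changes nothing
cat "$scratch"
```

Against the real file you need root, so something like echo /opt/local/bin/bash | sudo tee -a /etc/shells does the job, followed by chsh -s /opt/local/bin/bash.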

Jun 20

Format a Javascript Array Literal from MySQL Output

A quick bit of shell-fu.

To take a column from a MySQL database and quickly output it ready formatted as a Javascript array literal (without any specific escaping) do:

echo 'SELECT column FROM table WHERE some_column = "somevalue"' | mysql -uuser -ppass --silent yourdb | awk -v q="'" '{ print q $0 q }' | paste -s -d ',' - | sed 's/\(.*\)/[\1];/'

The first part of the command is self explanatory, you pipe in a query to mysql, and ask it to give you raw unadorned output. It will return each row for column ‘column’ from table ‘table’ as a line of output.

You pipe it to awk and ask it to wrap the values in single quotes. Due to shell escaping with single quotes, you set the q variable to a single quote. Paste then joins all the output lines together separated by commas.

Finally I use sed to wrap the resulting output in the Javascript array literal ‘[’ and ‘]’ symbols. Awk, or any other concatenation approach, would do just fine here too.
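If you want to check the formatting half of the pipeline without a database to hand, here is a minimal sketch that swaps mysql for a function printing one value per line. The rows function and its values are stand-ins I invented; the awk, paste and sed stages are the ones described above.

```shell
#!/bin/sh
# rows (hypothetical) stands in for the mysql call: one value per line.
rows() { printf '%s\n' alpha beta gamma; }

# Quote each line, join the lines with commas, wrap the result in [ ... ];
literal=$(rows \
    | awk -v q="'" '{ print q $0 q }' \
    | paste -s -d ',' - \
    | sed 's/\(.*\)/[\1];/')

echo "$literal"
# prints ['alpha','beta','gamma'];
```

As noted, nothing here escapes the values, so a value containing a single quote will produce broken Javascript.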

Jun 19

Date Range with Null Search in Solr 4.x

In Solr, to query for a date range you use the syntax:

field_name:[Start TO Finish]

You can also use wildcards and specific constants in a logical way e.g:

[NOW TO *] or [* TO *]

To search over documents that do not have a value for that date field, i.e. it is NULL, you use the syntax:

-field_name:[* TO *]

It is hard though, to search for dates that are EITHER NULL OR lie in a specific range.

It would seem logical to specify

date_field:[Start TO Finish] OR -date_field:[* TO *]

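Unfortunately this does not return what you would expect. The likely culprit is not the repeated field but the purely negative clause: nested inside a Boolean expression it has nothing to subtract from, so it matches no documents.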

So the trick is to effectively query for everything you do not want, then negate the result.

This approach says: select anything that has a value for this date field and is not in the range. When you invert that result, you get all the documents that either sit inside the date range or are NULL.

The query to weave this magic is

-(-date_field:[Start TO Finish] AND date_field:[* TO *])
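Since the double negative is easy to fat-finger, here is a minimal sketch of a shell helper that assembles the query string for you. null_or_range is a made-up name, and the field and date values passed to it are just examples.

```shell
#!/bin/sh
# null_or_range (hypothetical helper): build a query matching documents whose
# date field is either unset or inside [start TO finish].
null_or_range() {
    field=$1
    start=$2
    finish=$3
    printf '%s\n' "-(-${field}:[${start} TO ${finish}] AND ${field}:[* TO *])"
}

query=$(null_or_range date_field 'NOW-7DAYS' 'NOW')
echo "$query"
# prints -(-date_field:[NOW-7DAYS TO NOW] AND date_field:[* TO *])
```

When sending it to Solr over HTTP, remember to URL-encode it, e.g. with curl -G --data-urlencode "q=$query" against whatever select URL your install exposes.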