the web, mobile technology and location based services as I see it
In: Open Data|Programming
25 Jul 2010
In my previous post, I talked about open data and how making the Nigerian postcode data open and more accessible has wide potential for powering several applications. I’ve received several comments on Facebook with even more examples of how that data could be useful.
In this post, I will share how the extraction was done and how similar extraction scripts, or scrapers, could be written.
The first step in every scraping project I begin is to understand the HTTP dialog for the website I want to scrape. So I attempt to answer questions like these:
You can answer these questions with tools that let you inspect this dialog. I personally like to use Firebug for this task.
After you’ve determined the HTTP dialog, you can then write your script to do the extraction. You can write scrapers in any language, provided it can retrieve HTTP resources and parse HTML. The parsing aspect of a scraper is usually the most interesting part because a lot of parsing libraries choke when they encounter badly formed HTML.
In the code snippet below, I used BeautifulSoup for parsing the HTML and Python’s urllib2 for the HTTP communication.
The code is available on GitHub and, although it changes as more functionality is added, you can view the revision log of the gist to see the history of changes.
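The gist itself is not reproduced here, so what follows is only a minimal sketch of the same idea and not the actual extraction script. It swaps in Python 3's standard library (html.parser and urllib.request) for BeautifulSoup and urllib2 so that it runs without extra installs. The table layout, the sample rows and the fetch_rows helper are all assumptions made for illustration; the real postcode pages may be structured differently.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _end_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._end_row()   # tolerate a missing </tr>
            self._row = []
        elif tag == "td" and self._row is not None:
            self._end_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()


def parse_rows(html):
    """Return the data rows found in an HTML string."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_rows(url):
    """Download a page and parse it (hypothetical helper; the caller supplies the URL)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return parse_rows(response.read().decode(charset, errors="replace"))


# Made-up sample with deliberately sloppy markup in the last row.
SAMPLE = """
<table>
  <tr><th>Area</th><th>Postcode</th></tr>
  <tr><td>Example Area</td><td>000001</td></tr>
  <tr><td>Another Area<td>000002
</table>
"""
print(parse_rows(SAMPLE))
```

The last sample row leaves its cells and row unclosed on purpose. Closing any open cell or row whenever a new one starts is one simple way to cope with the badly formed HTML mentioned above.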
The Nigerian postcode system has been operational for quite a while, but when you ask people to fill in forms, they enter 234 or something similar as their postcode. There’s very little information available about postcodes and very few projects even use them. I attribute this largely to ignorance.
Why is a postcode system useful? It’s essential for routing and location identification. When someone gives his street name as Adenuga Str in Lagos, is he referring to the Adenuga Str in Alausa or the one in Oremeji Ifako? With a postcode system, this ambiguity is removed because you can identify which location is being referred to. Another use is package or mail routing, or routing in general. If I need to deliver three packages to locations A, B and C, I can efficiently route those packages if I know that B comes before C and then A along a particular route. I’m talking from a logical perspective. I might not know what other industries use postcode systems for, and that brings me to the case for open data.
The open data movement clamours for and encourages the sharing of data in as raw and discrete a form as possible. Why? The advantages are similar to the reasons web applications have APIs: to enable repurposing and reuse of data. If I’m using postcode data to know which hospitals are closest to which postcode locations, another project might use it for something equally or even more useful.
Postcode data has, until now, been something you could only access from a few websites, mostly the Nigerian Postal Service website and sites like AddressDoctor. These sites only let individuals look up their own postcodes, which is of very little utility, I would say. However, with my effort and the efforts and inspiration of folks like Dipo Fasoro of WangoNet (who’s a strong believer in and evangelist of open data) and Kayode Muyibi of Nairahost, we’ve been able to obtain this information from the Nigerian Postal Service website and convert it to a form that can be used by other applications.
Imagine the possibilities: dating websites could use the postcode to identify a person’s location and match their profile to people in their area, and ecommerce websites could use it to batch the delivery of orders. The applications are numerous. Eventually we hope to be able to map center points for each of these postcodes, increasing the utility even further.
The data is currently in exactly the form in which it was extracted. We’ve identified a number of corrections to be made and we’re crowdsourcing the cleanup of the data (essentially getting interested members of the public to help).
There’s a lot of data locked up on sites like the Nigerian Postal Service’s, and what we’ve demonstrated with the postcode data is that this data can be put to more use if it is made open and available in a form that computers can use.
You can access the data that has been put on Google Fusion Tables here.
In: General
9 Feb 2010
Well, I’ll say a lot, and only time will really tell. If you’re wondering what I’m talking about, it’s the recently launched Google Buzz.
Google Buzz is an application that allows you to share web links, photos, videos etc with people who are important to you but that’s only half the story. Google Buzz for Mobile is a location based service that allows users of the application to share messages and pictures that are location-aware.
In my attempt to explain it to my girlfriend, I used this example. Assume I walk into an eatery and I discover that they’re giving a free donut for every purchase above ₦500. I can quickly create a buzz to spread the word. Now because I’m using my phone which has a built-in GPS, it records my location and every other person who sees the buzz can identify the location.
There are countless other use cases. Imagine a community service that allows community members to report accidents using Google Buzz for Mobile. Others might include crime reporting, reporting lost items, reviews and so on.
I’ve been playing around with it and I’m hoping to see more people come on board to give it a try. If you do not have an iPhone or an Android 2.0 compatible phone, your best bet would be to install Google Maps for Mobile 4.0 by visiting http://mobile.google.com/ on your mobile phone and clicking on Google Maps. If your phone is supported and you can install Google Maps for Mobile version 4.0, then you can enable the Buzz layer and start buzzing.
In: General
9 Jan 2010
I’ve been looking for ways to express my sentiments about the recent “blacklisting” of Nigerian citizens traveling to the US by US authorities. This is my response. Enjoy it.
I read something really thought-provoking on Seth Godin’s blog today and I thought it necessary to reblog it.
–
I’ve noticed that people who read a lot of blogs and a lot of books also tend to be intellectually curious, thirsty for knowledge, quicker to adopt new ideas and more likely to do important work.
I wonder which comes first, the curiosity or the success?
Sometimes an auto-generated number isn’t enough and what you really need is a unique identifier. People use different techniques for generating unique identifiers. My favorite has been generating a random number and then passing it through the MD5 hash function. Here’s an example I was once using:
<?php $unique_identifier = md5(rand(100000, 999999)); ?>
The problem with this is that rand(100000, 999999) can only produce 900,000 distinct values, and hashing them with MD5 does not add any more. I didn’t realize my error until I started getting MySQL integrity constraint errors for a unique column that stored that value.
I switched to a more elegant solution:
<?php $unique_identifier = md5(uniqid(rand(), true)); ?>
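The uniqid() function generates an identifier based on the current time in microseconds. Here it is given a rand() prefix, and the second argument (true) tells it to add more entropy, which makes duplicate values far less likely.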
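To see why the first version ran into duplicates so quickly, here is a rough sketch in Python (not PHP, and not the code I was running) that uses the standard birthday approximation to estimate the chance of at least one repeat after a given number of draws from 900,000 values. The function names are mine, and uuid4 is shown only as one common alternative, not as an equivalent of uniqid().

```python
import math
import uuid

# rand(100000, 999999) is inclusive at both ends: 900,000 possible values.
SPACE = 999_999 - 100_000 + 1


def collision_probability(draws, space=SPACE):
    """Birthday approximation: chance that `draws` uniform picks contain a repeat."""
    return 1.0 - math.exp(-draws * (draws - 1) / (2.0 * space))


def new_identifier():
    """A 32-character hex identifier built from 122 random bits."""
    return uuid.uuid4().hex


for draws in (100, 1_000, 5_000):
    print(draws, round(collision_probability(draws), 3))

print(new_identifier())
```

Under this approximation, a repeat is already more likely than not well before two thousand rows, which matches how soon the unique column started complaining.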
In: General
20 Nov 2009
I was really excited this morning when I read about the new Google Latitude apps (Location History and Alerts) on the Official Google Mobile Blog.
In case you’re not so conversant with Google Latitude, it’s a location based service that runs on top of Google Maps allowing you to track your geographical location and that of your friends. Your friends will also need to have signed up with Google Latitude for you to be able to track them.
One thing I would have loved Google Latitude to do, though, is alert me when my friends are close by without my having to constantly check their location in the application. The good news is that this is now possible.
Google Location History is a Google Latitude app that allows you to see where you’ve been over time, so you can easily tell which places you spent the most time in. With Google Location Alerts, you can get an email or SMS alert whenever your friends or colleagues are nearby.
One other feature that would be really cool, but is not out yet, would be the ability to tell when any of your friends are at a particular location, say the cinema or a supermarket.
In: Gadgets
19 Nov 2009
This is hot! I just got an email from Amazon announcing the availability of a free download of Kindle for PC. As I just explained to a colleague, the Kindle is a device from Amazon that allows you to carry around and read ebooks that you download from the Amazon store.
The Kindle for PC on the other hand, is a software application you download from Amazon that enables you to read the same ebooks as you would on the Kindle device, on your PC.

Kindle for PC Books List
Why the excitement? Well, for one, the Kindle was one of the things I had on my wishlist for Christmas. Even though I still intend to get the device (or some other ebook reader), this is like a wish granted. Maybe a netbook will not be a bad idea after all.
This is coming at a time when the Kindle is facing stiff competition from new entrants. If you ask me, this strategy of creating a standalone PC application makes a lot of sense.
I spent last week in the city of Jos where I went to spend some time with my family and also attend my class reunion. I had an idea to organize a seminar where I’ll teach and talk about anything I thought would be cool and constitute new knowledge.
So I went about contacting some of my colleagues who gave me pointers on whom to see and after making a few contacts, I was well on my way to organizing the seminar.
I got a lot of support there and we were able to get about a hundred diploma students at the University of Jos to attend. I was initially thinking about organizing something strictly for web developers and programmers. I had a few ideas about web-based APIs, mashups, code versioning, improving web application performance, etc. but when I was told the caliber of people who were going to be attending the seminar, I decided to stick to something regular computing students will be interested in.
I gave a presentation on Web APIs and Mashups and, after the short slideshow, went into building a practical mashup. I was a little concerned about speaking over their heads, so I looked for something very practical to demonstrate a mashup. I ended up building an application that sent a text message to their mobile phones. Afterwards, I went on to talk about the various freely available APIs on the web that can be used to build smart applications and encouraged them to try them out.
Very few of the individuals there had done anything for the web, but based on the inquiries I received at the end of the seminar, I knew a lot of them had become very interested in exploring further.
I ended the seminar by talking about web mapping and most especially Google Mapmaker and told them about how Plateau State had very little content on the Internet and how that was an opportunity to do something. I’m told that there are plans to build a team that will take what they’ve learnt from the seminar and do something about it.
The success of the event was more than I envisaged and I’m encouraged to give more of these talks and presentations in other parts of the country.
Git is currently my favorite source code versioning tool. Back when I used Subversion, I knew about a feature called hooks but never used it.
Essentially, hooks allow you to execute custom scripts when you perform certain actions on your repository, like committing files, pulling updates and so on. This is very useful, as you can write hook scripts to (for example) automatically FTP a file to your web server when a change has been made.
A whole lot of really cool hook scripts have been written and if you use any code versioning tools, you should check out the ones that have been written for the tool you use.
In particular, I find that developers sometimes check in code that has syntax errors. This happens in environments where there are no strict code testing rules. It can be really annoying when you or someone else does this and you have to fix it and then commit again… not professional at all. So I came across this post by Travis Swicegood that lists code that runs PHP lint on your PHP files before committing them to the repository. PHP lint (php -l) basically checks the syntax of your code and either gives an “ok” or prints the offending line.
For one of the projects I’m working on, I had to change line 11 of Travis’ code to read:
$filename_pattern = '/\.(php|engine|theme|install|inc|module|test)$/';
instead of
$filename_pattern = '/\.php$/';
If you’ve done Drupal coding, you’ll quickly recognize that
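For readers who want to see the moving parts, here is a rough sketch of such a pre-commit check written in Python. It is not Travis’ script, only an illustration of the same idea under a few assumptions: git and php are on the PATH, the file names in the demo line are made up, and the check lints the working-tree copy of each file and not the exact staged content.

```python
import re
import subprocess

# Same extensions as the modified pattern above, written as a Python regex.
FILENAME_PATTERN = re.compile(r"\.(php|engine|theme|install|inc|module|test)$")


def files_to_lint(filenames):
    """Keep only the names whose extension marks them as PHP source."""
    return [name for name in filenames if FILENAME_PATTERN.search(name)]


def staged_files():
    """Names of files added, copied or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main():
    """Return 0 if every staged PHP file passes `php -l`, otherwise 1."""
    failed = False
    for name in files_to_lint(staged_files()):
        lint = subprocess.run(["php", "-l", name], capture_output=True, text=True)
        if lint.returncode != 0:
            print(lint.stdout or lint.stderr)
            failed = True
    return 1 if failed else 0


# Demo of the filter alone, with made-up file names.
print(files_to_lint(["index.php", "README.txt", "mymodule.module", "mymodule.install"]))
```

To try it as a real hook, you would save it as .git/hooks/pre-commit, make it executable, drop the demo line and end the file with `raise SystemExit(main())`. Git aborts the commit when the hook exits with a non-zero status.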
Tim Akinbo's Weblog is the personal weblog of Tim Akinbo. Here he discusses issues relating to technology. Special interests include the web, mobile technology and location based services.