14 - March - 2010

Generating a Twitpic.com Gallery

Post by Mike C

Whilst doing some R&D around Twitter-related services I've noticed gaps in the APIs of a number of them. And it's not just me who groans: support queues, mailing lists and Twitter itself are full of people grumbling about missing functionality in these services.

Today it was the turn of photo hosting service Twitpic, which provides a great basic API but no way to extract all images for a specific user. A minor bonus is that each user of the service gets an RSS feed of their uploaded images, although it only serves the last 20 images - 20 being the number of photos shown per page.

A request to add the ability to return all images for a given screenname has sat in Twitpic's request queue for over a year, so I wasn't going to hold out for an overnight API change!

The only clear way to get all the images a user has uploaded appeared to be to fetch the Twitpic user profile page and follow the pager at the bottom to reach the older images. This was a quick job for HTML page scraping and some regular expressions! QueryPath would have been a nicer alternative, but it introduces external dependencies for what is a short example script.

The HTML scraping was achieved thanks to some quick 'n dirty regular expression experiments using the excellent http://www.rubular.com/ site.
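As a quick sanity check, the photo-count pattern used later in the script can be tried against a small inline fragment. The markup below is mocked up for illustration (the class name is invented); the live profile page may differ:

```php
<?php
// A hypothetical fragment standing in for the Twitpic profile markup;
// the "photo-count" class is invented for this example.
$sample = '<a href="/photos/buddaboy">Photos</a></div>' . "\n"
        . '<div class="photo-count">47</div>';

$matches = array();
preg_match('/>Photos<\/a><\/div>\s*.*?>(\d*?)<\/div>/', $sample, $matches);

echo $matches[1]; // 47
```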

The only reason this script is written in PHP is that it's the language I know best - the method for collecting the data can easily be reproduced in any other scripting language. The process boils down to a very simple workflow:

  • Attempt to grab a user's profile page.
  • If the page is fetched successfully, read the value for the total number of photos uploaded.
  • Divide the total photo count by 20 and round up to the nearest whole number to get the number of photo pages that need scraping.
  • Loop through the pages and grab each page's HTML.
  • Extract the anchor href attribute value that's wrapped around each of the image thumbnails.
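The final extraction step can be tried in isolation against a mocked-up pair of thumbnails. The markup here is illustrative, not copied from a live page, but it matches the shape the script's pattern expects:

```php
<?php
// Pull the Twitpic short IDs out of the thumbnail anchors.
// The HTML below is a stand-in for the real profile page markup.
$html = '<div class="profile-photo-img"><a href="/abc123"><img src="t1.jpg"></a></div>'
      . '<div class="profile-photo-img"><a href="/xyz789"><img src="t2.jpg"></a></div>';

$matches = array();
preg_match_all('/profile\-photo\-img">\s*<a href\="\/(.*?)"><img/', $html, $matches);

// $matches[1] now holds array('abc123', 'xyz789')
```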

In the example code provided I added a foreach() at the end of the workflow to render the list of Twitpic IDs as HTML so that you can see it working.

To use the code, change the $screenname variable at the start of the script to the name of the Twitter account you want to get all images for.

<?php
/*
 * Twitpic Gallery
 * Produces a list of Twitpic.com images for a given screenname
 *
 * @author Mike Carter <http://twitter.com/buddaboy>
 */

// Change this to grab a Twitter users photos
$screenname = 'buddaboy';

// Initialise Variables
$photo_ids = array();
$matches = array();

// Confirm there is a Twitpic account for the screenname
if ($page = @file_get_contents('http://twitpic.com/photos/' . $screenname)) {

  // How many photos?
  preg_match('/>Photos<\/a><\/div>\s*.*?>(\d*?)<\/div>/', $page, $matches);
  $photo_count = isset($matches[1]) ? (int) $matches[1] : 0;

  // Work out how many pages we need to gather images from
  $total_pages = ceil($photo_count / 20);

  // Collect all the image ids into an array
  for ($i = 1; $i <= $total_pages; $i++) {
    $photo_ids = array_merge($photo_ids, get_twitpic_photos($screenname, $i));
  }
}

// Render thumbnails of all images for the specified screenname
foreach ($photo_ids as $id) {
  echo '<img src="http://twitpic.com/show/thumb/' . $id . '" />';
}

/*
 * Grab all Twitpic image Ids for a given screenname and page
 */
function get_twitpic_photos($screenname, $page = 1) {
  $html = file_get_contents('http://twitpic.com/photos/' . $screenname . '?page=' . $page);

  $matches = array();
  preg_match_all('/profile\-photo\-img">\s*<a href\="\/(.*?)"><img/', $html, $matches);

  return $matches[1];
}


The script assumes your PHP installation allows reading data from external URLs (i.e. allow_url_fopen is enabled). Leave a comment if you have suggestions for improving or extending the code. Improvements could include scraping each photo's description and upload date alongside the ID - neither of which is available via the Twitpic API.


Mike C

Managing Director

12 years of Drupal development wrangling and a background in digital project architecture.

Comments

Hi Mike,

Nice work on the script! Have you thought about putting it up on GitHub so others can fork the script, add new features, etc?

Stephen Corona

You can get the RSS feed from there, and that will reduce your code.

ranacse05

No good - as I mentioned above:

A minor bonus is that each user of the service gets an RSS feed of their uploaded images, although it only serves the last 20 images - 20 being the number of photos shown per page.

Mike C

In reply to by ranacse05

Unfortunately, the script does not work. Images are not displayed.

Aleksey
