Developer? 10x10 is here for you.

If you're an artist or developer interested in information visualization, 10x10 can be a great data resource for you and your work. As an information artist myself, I understand the difficulty of finding interesting and timely data sources on the web. 10x10 hopes to help with this problem. Every hour, 10x10 gathers the 100 most important words and pictures in the world, based on what's happening in the news. You are welcome to use the information produced by 10x10 in your own non-commercial projects.

Displaying these pictures in a 10x10 grid, as 10x10 does, is just one application of the data. There are countless other ways to use this data -- analyzing how words and pictures come in and out of the news over time, studying world trends based on what's happening in the news, creating picture-based worldviews -- your imagination is the only limit.

10x10 has been designed to make it easy for developers to use the data it produces. This page explains the basic information architecture of 10x10, and how you can go about using its data.

Information Architecture.

The data 10x10 produces is structured in a series of folders, all online and publicly accessible. The folders are named in accordance with their year/month/day/hour, in the following manner:

  • Standard location of a 10x10 data folder for a single hour:
    http://tenbyten.org/Data/global/YYYY/MM/DD/HH/
  • For example, the word and picture data for November 5, 2004 9am would be stored at:
    http://tenbyten.org/Data/global/2004/11/05/09/
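
The folder naming scheme above can be sketched in a few lines of Python. The function name `hour_folder_url` and the constant `BASE_URL` are my own choices for illustration, not part of 10x10 itself; the URL pattern is the one described above.

```python
from datetime import datetime

# Root of the 10x10 data folders, as given in the text above.
BASE_URL = "http://tenbyten.org/Data/global/"


def hour_folder_url(moment):
    """Return the data folder URL for the hour that contains `moment`.

    `moment` is assumed to already be expressed in the time zone 10x10
    uses for its folder names.
    """
    # YYYY/MM/DD/HH with zero padding, followed by the trailing slash.
    return BASE_URL + moment.strftime("%Y/%m/%d/%H") + "/"


# November 5, 2004 at 9am, the example used above.
print(hour_folder_url(datetime(2004, 11, 5, 9)))
# prints http://tenbyten.org/Data/global/2004/11/05/09/
```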

Within the folder for each hour, you will find 200 images (each word has a full-size image and a thumbnail image) and a word list file titled "words.txt". The images are all JPEGs, and are all titled in the following manner:

  • "iraq.jpg" - Full size (227x149 pixels) image for the word "iraq".
  • "iraq2.jpg" - Thumbnail size (60x40 pixels) image for the word "iraq".
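
As a rough sketch, the two image addresses for a word can be derived from the folder URL and the word itself. The helper name `image_urls` is hypothetical, and the code assumes the word appears in the file name exactly as it is written in "words.txt", which the "iraq" example suggests but does not guarantee for every word.

```python
def image_urls(folder_url, word):
    """Return (full_size_url, thumbnail_url) for `word` in a data folder.

    Assumes `folder_url` ends with "/" and that the file name is the word
    exactly as listed, plus ".jpg" (full size) or "2.jpg" (thumbnail).
    """
    full_size = folder_url + word + ".jpg"    # 227x149 pixels
    thumbnail = folder_url + word + "2.jpg"   # 60x40 pixels
    return full_size, thumbnail


folder = "http://tenbyten.org/Data/global/2004/11/05/09/"
full_size, thumbnail = image_urls(folder, "iraq")
print(full_size)   # prints http://tenbyten.org/Data/global/2004/11/05/09/iraq.jpg
print(thumbnail)   # prints http://tenbyten.org/Data/global/2004/11/05/09/iraq2.jpg
```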

To find out the top 100 words for the hour, in ranked order, consult the "words.txt" file in the hour's folder. "words.txt" files have 100 lines, with one word on each line. The #1 (most important) word is on line 1, and the #100 word is on line 100. The lines end with the newline character ("\n"), with no spaces or punctuation. Here is a sample "words.txt" file.

You can easily parse these "words.txt" files, line by line, using any standard scripting language, such as Perl or PHP.
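
For instance, here is a minimal Python sketch of that parsing. The names `parse_words`, `rank_of`, and `fetch_words` are hypothetical, the sample text is a made-up three-line stand-in for a real 100-line file, and the UTF-8 decoding in `fetch_words` is an assumption, since the encoding of "words.txt" is not stated here.

```python
from urllib.request import urlopen


def parse_words(text):
    """Turn the contents of a "words.txt" file into a ranked list.

    Index 0 holds the #1 word. Blank lines are skipped, so a trailing
    newline at the end of the file does no harm.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def rank_of(words, word):
    """Return the 1-based rank of `word`, or None if it is not listed."""
    return words.index(word) + 1 if word in words else None


def fetch_words(folder_url):
    """Download and parse "words.txt" from a data folder (not called here)."""
    with urlopen(folder_url + "words.txt") as response:
        # Assumption: the file decodes as UTF-8.
        return parse_words(response.read().decode("utf-8", errors="replace"))


# Made-up sample; "alpha" and "beta" are placeholders, not real 10x10 data.
sample = "iraq\nalpha\nbeta\n"
words = parse_words(sample)
print(words)                   # prints ['iraq', 'alpha', 'beta']
print(rank_of(words, "beta"))  # prints 3
```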

Images of Day, Month, and Year

In addition to gathering data for each hour, 10x10 determines the top 100 words and pictures for every day, month, and year, based on word and picture popularity in that timeframe. The data for days/months/years follows the same naming conventions as outlined above for hour data. For example, the data folder for the top 100 words/pictures for November 5, 2004 would be located at:
http://tenbyten.org/Data/global/2004/11/05/

The data for the top 100 words/pictures of November, 2004 would be located at:
http://tenbyten.org/Data/global/2004/11/

And the data for the top 100 words/pictures of 2004 would be located at:
http://tenbyten.org/Data/global/2004/

The day/month/year folders use the same "words.txt" ranking system as outlined above.

Obviously, top data for days/months/years is not available until after the given day/month/year has finished.

  • Top DAY data is generated at 12AM of the next day
  • Top MONTH data is generated at 12AM of the first day of the next month
  • Top YEAR data is generated at 12AM of the first day of the first month of the next year
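
The day/month/year folders and their generation times can be sketched as follows. The function names `period_folder_url` and `available_from` are hypothetical, and the returned times are plain (naive) datetimes meant to be read in 10x10's own time zone.

```python
from datetime import datetime, timedelta

BASE_URL = "http://tenbyten.org/Data/global/"


def period_folder_url(year, month=None, day=None):
    """Return the folder URL for a year, a month, or a single day."""
    if day is not None and month is None:
        raise ValueError("a day needs a month")
    parts = ["%04d" % year]
    if month is not None:
        parts.append("%02d" % month)
    if day is not None:
        parts.append("%02d" % day)
    return BASE_URL + "/".join(parts) + "/"


def available_from(year, month=None, day=None):
    """Return the moment (12AM) at which the top data is generated."""
    if day is not None and month is None:
        raise ValueError("a day needs a month")
    if month is None:
        # Top YEAR data: first day of the first month of the next year.
        return datetime(year + 1, 1, 1)
    if day is None:
        # Top MONTH data: first day of the next month (December rolls over).
        if month == 12:
            return datetime(year + 1, 1, 1)
        return datetime(year, month + 1, 1)
    # Top DAY data: 12AM of the next day.
    return datetime(year, month, day) + timedelta(days=1)


print(period_folder_url(2004, 11, 5))  # prints http://tenbyten.org/Data/global/2004/11/05/
print(period_folder_url(2004, 11))     # prints http://tenbyten.org/Data/global/2004/11/
print(available_from(2004, 11))        # prints 2004-12-01 00:00:00
```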

Data for Current Hour

To simplify the process of getting the data for the current hour, 10x10 keeps some relevant current information in the directory: http://tenbyten.org/Data/global/Now/.

In this folder, you will find the following files:

  • "words.txt" -- the current top 100 words, as explained above.
  • "now.jpg" -- a single JPEG of the 10x10 grid for the current hour.
  • "date.jpg" -- the current hour printed as an image.
  • "dateString.txt" -- a one line text file containing the directory link to the current hour (e.g. "2004/11/05/09")
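
One possible way to use "dateString.txt" is to turn its single line into both a date and the URL of the current hour's folder. This is only a sketch: the helper names are hypothetical, and it assumes the file holds exactly the "YYYY/MM/DD/HH" form shown in the example, possibly followed by a newline.

```python
from datetime import datetime

BASE_URL = "http://tenbyten.org/Data/global/"
NOW_URL = BASE_URL + "Now/"  # the "current hour" folder described above


def parse_date_string(text):
    """Parse the contents of "dateString.txt" into a datetime."""
    return datetime.strptime(text.strip(), "%Y/%m/%d/%H")


def current_hour_folder(text):
    """Build the full hour folder URL from the contents of "dateString.txt"."""
    # Strip whitespace and any stray slashes, then add the trailing slash.
    return BASE_URL + text.strip().strip("/") + "/"


contents = "2004/11/05/09\n"  # the example value from the list above
print(parse_date_string(contents))    # prints 2004-11-05 09:00:00
print(current_hour_folder(contents))  # prints http://tenbyten.org/Data/global/2004/11/05/09/
```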

TECHNICAL NOTES:

1) Since 10x10 typically takes around 5-10 minutes to run, data for a given hour is generally not available until 5-10 minutes past the hour.

2) When calling one of 10x10's data directories, you MUST append the trailing slash ("/"), or the Apache server might not recognize the request. For example, to access the data for November, 2004, you must use: "http://tenbyten.org/Data/global/2004/11/", as opposed to "http://tenbyten.org/Data/global/2004/11".

3) All data times and dates are based on Eastern Standard Time (EST).
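
Taken together, these notes might be handled along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, EST is treated as a fixed UTC-5 offset with no daylight saving adjustment, and a 10-minute margin (the upper end of the range in note 1) is used before an hour's data is considered ready.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Standard Time as a fixed UTC-5 offset, year round.
EST = timezone(timedelta(hours=-5), "EST")
# Assumption: wait the full 10 minutes mentioned in note 1.
PROCESSING_MARGIN = timedelta(minutes=10)


def ensure_trailing_slash(url):
    """Append "/" to a directory URL if it is missing (note 2)."""
    return url if url.endswith("/") else url + "/"


def latest_ready_hour(now_utc):
    """Return the most recent hour (in EST) whose data should be ready.

    `now_utc` must be a timezone-aware datetime.
    """
    now_est = now_utc.astimezone(EST)
    top_of_hour = now_est.replace(minute=0, second=0, microsecond=0)
    if now_est - top_of_hour < PROCESSING_MARGIN:
        # Too early in the hour: fall back to the previous hour's data.
        top_of_hour -= timedelta(hours=1)
    return top_of_hour


print(ensure_trailing_slash("http://tenbyten.org/Data/global/2004/11"))
# prints http://tenbyten.org/Data/global/2004/11/

# 14:03 UTC is 9:03 EST, so the 9:00 data may not be ready yet.
early = datetime(2004, 11, 5, 14, 3, tzinfo=timezone.utc)
print(latest_ready_hour(early).strftime("%Y/%m/%d/%H"))  # prints 2004/11/05/08
```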

Attribution.

You are welcome to use the data produced by 10x10, but if you do, please include a link back to 10x10 (http://tenbyten.org) on your site. Thanks!

Please note that 10x10 does not hold the rights to any of the images that appear on this site. The images come from several leading international news sources, and those sources retain all rights to their images. The photographs are used by 10x10 strictly for non-commercial purposes.

Contact.

You can contact Jonathan Harris by mailing: jjh "AT" number27 "DOT" org


10x10™ ©2004 Jonathan Harris | Number27