Discussion about the International Children’s

Discussion about the International Children’s Digital Library on the Sigia-L list: Alfred Werner: “I engaged an expert – my nine year old son. He thought it was interesting enough that he asked me to install it on his computer. There are a few problems with the interface … the loopy back arrow isn’t obvious, moving your selection to the box up top, which you then click to get it to move back to the main ‘action pane’ – also non-intuitive. When my son got lost, he just clicked on the house and drilled back in… Once you play with it – it’s pretty straight forward. I do like the spiral view of the book – it’s just cool. I think for the audience they should add more sound effects – subtle but present. I would like to see the thumbnails slightly larger or clearer – it’s hard to tell what you’re getting without committing to opening a book.”

Candy Schwartz wrote (not archived yet I believe): “The Suffolk Law Library catalog lets you search and limit by binding colour for many reference books, and the idea of size and colour as book search attributes for catalogues was actually discussed in the mid to late 60s. This is the way people remember books. Also, at least one library catalog (not easily accessible over the Web) has used a completely graphical interface (not as elegant as the newest, but this was a decade ahead of its time). Want to look for books on romance? Pick the two lovers picture. There are also several catalogues which let you search for books by attributes other than normal (check out Book Forager)”

Book Forager is indeed another fascinating approach to browsing faceted classification systems, one where the facets don’t contain topics but values within a range (from “very scary” to “very safe” for example). If the topics were set up to mirror that structure (from “very” to “not at all” for a certain characteristic), this info could easily be expressed in XFML, although that would mean imposing a semantic limitation that isn’t inherent in XFML.
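To make the "values within a range" idea concrete, here is a minimal sketch in Python. It is not how Book Forager or XFML work internally; the scale, the book titles and the function name `books_in_range` are all made up for illustration. The only point is that a facet becomes an ordered list of topics, and a query picks a slice of that list.

```python
# Hypothetical scale facet: topics ordered from one extreme to the other.
SCARINESS = ["very scary", "scary", "neutral", "safe", "very safe"]

# Made-up books, each tagged with exactly one value on the scale.
BOOKS = {
    "Ghost Stories": "very scary",
    "The Dark Wood": "scary",
    "A Day at the Farm": "safe",
    "Bedtime Bunny": "very safe",
}


def books_in_range(books, scale, low, high):
    """Return titles whose value lies between low and high on the scale (inclusive)."""
    # Accept the two ends in either order.
    lo, hi = sorted((scale.index(low), scale.index(high)))
    return sorted(
        title
        for title, value in books.items()
        if lo <= scale.index(value) <= hi
    )


print(books_in_range(BOOKS, SCARINESS, "neutral", "very safe"))
```

The ordering of the scale is exactly the "semantic limitation" mentioned above: a plain topic list carries no order, so the order has to be agreed on outside the format.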

What lies beneath: excellent article

What lies beneath: excellent article by Adam Greenfield shedding some light on that thing called “business requirements”. (You know, the one we need to make play nice with user requirements.) An important article in the struggle to cross boundaries between disciplines. IA has been doing well in understanding other disciplines, except for branding (making headway there) and business. Many IAs lack an understanding of branding and of business issues like positioning, business strategy or lock-in. We need this: business strategy has a direct and observable influence on the design of products (including websites). When IAs lack understanding of business issues, that direct link between business goals and design gets severed. The business guy won’t understand design in depth. The designer won’t understand business in depth. IAs should be able to ask good questions in both domains, thus helping to make design accountable.

Distributed Metadata

Vanderwal points to Structured Content: What’s in it for Writers?. Key insight: “Very few people are willing to change the way they work in order to make somebody else’s life easier.”

A similar question for metadata: what’s in it for writers/indexers? I believe this is one of the unsolved issues in the whole metadata field. The answer tool vendors give is: “The machines will do the work”. I don’t think so. The machines can assist the work of humans, but there is a deep reason why people should do metadata work: without actually working with categories (i.e. if you have them generated), you won’t understand/internalize them. The best categories are the ones you create yourself – they are structured the way you work/think. People do index stuff for themselves, but you can only impose limited structure upon that indexing because everyone sees the world differently. So the challenge becomes: how do we use people’s personal indexing so that it becomes usable by others as well? How do we tie in bottom-up structuring with top-down? The distributed metadata approach has potential there. Unproven potential, but as the saying goes, there’s hoping.

Webgraphics points to ONContent, which

Webgraphics points to ONContent, which provides free syndication feeds. The difference with Newsisfree seems to lie in that ONContent focusses on feeds for web people, including design, techies, IAs and such, and offers some categorization of these feeds. Also, they don’t search out feeds, rather you sign up to get greater exposure. Their FAQ explains some more. Their sign-up form sucks though – no indication of required fields, and when you don’t fill one in, on the next page all your entries are erased. Stopped me dead in my tracks.
They display a small text ad with each feed, a clever business idea that I hope will take off (in a respectful way). Related: I find myself browsing a few sites in the morning and then switching to my newsreader.

CI Day 3, part two

The content inventory we are doing uses more categories than the usual ones (ROT and title). I am realising Excel is not a structured data format – it lets you enter bad data and generally mess things up, especially if things are to be imported into a database later. I did set up some dropdowns to structure the categories (so you don’t type “redundant” once and “redundent” the next time) but that was it. There is replication of rows and other evil things.

Content Inventory Tip 4: When doing complex categorization, use structured data entry as much as possible. More specifically: be careful when sorting columns in Excel. If you select a single column and sort it alphabetically, the other columns don’t sort with it. If you assume they do (as I and others I have talked to did), and sort a few columns that way, you will have messed up your spreadsheet beyond recognition and will need to spend a lot of time fixing it. (Yes I did.) To keep the rows together, select all the columns and then select data>sort.
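For anyone who ends up moving the inventory out of the spreadsheet, both halves of this tip can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names `url` and `status`, the allowed values and the sample rows are invented, not taken from the actual inventory. The idea is that every row is one record, so sorting moves whole rows, and a fixed list of allowed values catches typos like “redundent”.

```python
# Assumed controlled vocabulary for the status column (illustrative only).
ALLOWED = {"ok", "redundant", "outdated", "trivial"}

# Made-up inventory rows; each row is one record, so its cells stay together.
rows = [
    {"url": "http://example.org/b", "status": "redundant"},
    {"url": "http://example.org/a", "status": "redundent"},  # typo on purpose
    {"url": "http://example.org/c", "status": "ok"},
]


def validate(rows):
    """Return (row_number, value) for every status that is not in ALLOWED."""
    return [
        (number, row["status"])
        for number, row in enumerate(rows, start=1)
        if row["status"] not in ALLOWED
    ]


def sort_rows(rows, column):
    """Sort whole rows by one column; the other cells travel with their row."""
    return sorted(rows, key=lambda row: row[column])


print(validate(rows))
print(sort_rows(rows, "url"))
```

Because `sort_rows` reorders records rather than a single column, the status of each page can never end up next to the wrong URL.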

Browsing faceted classification? A child’s game!

You can click on as many of the category buttons as you would like. Clicking on more of these buttons will give you a smaller number of books to look at.

This is fucking brilliant. Thanks to Ian Bruk for the pointer. The International Children’s Digital Library (ICDL) is a 5-year research project to develop innovative software and a collection of books that specifically address the needs of children as readers. The interface uses a Faceted Classification combined with a Zoomable Interface (looks like it uses the same engine as Photomesa), and you know what: it works. For kids! I am in awe.
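The “more buttons, fewer books” behaviour quoted above is the heart of faceted browsing, and it can be sketched very simply. What follows is not the ICDL’s code; the book titles, the category names and the function `select` are assumptions made up for this example. Each book carries a set of categories, and a book stays visible only if it has every category that was clicked.

```python
# Made-up catalogue: each book is tagged with a set of categories.
BOOKS = {
    "Bedtime Bunny": {"animals", "short", "english"},
    "Le Petit Chat": {"animals", "short", "french"},
    "Ghost Stories": {"scary", "long", "english"},
}


def select(books, clicked):
    """Return the titles that carry every clicked category."""
    wanted = set(clicked)
    # Subset test: all clicked categories must be among the book's categories.
    return sorted(title for title, cats in books.items() if wanted <= cats)


print(select(BOOKS, []))                     # nothing clicked: every book
print(select(BOOKS, ["animals"]))            # one button: fewer books
print(select(BOOKS, ["animals", "french"]))  # two buttons: fewer still
```

Each extra click can only keep the result the same size or shrink it, which is probably why the explanation works so well for kids.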

You can try it out (it is a Java app) right here: you’ll need Java installed on your machine (it’s probably there). What I want now is for them to import XFML. Imagine the possibilities.

They have a video (link to download page – video is 24Megs) about the making of ICDL. A must see if you have the bandwidth – it’s really good: “I actually like the French one, because if it was in French and English it could teach you some words in French.” – “Sometimes I read the book over again – sometimes I figure: this has to be a happy or a sad book”. What a great team! Participatory design example: (member of the kids team) “Bug Bug, I found a Bug!”.

Content Inventory Day Three

Content Inventory Tip 1: If you have content on a wide variety of sites, order your URL column alphabetically to get an overview of which pages are on the same (sub)sites.
Content Inventory Tip 2: Take regular breaks.
Content Inventory Tip 3: Before starting a detailed content inventory (bottom up), do an initial CI exercise and some top-down work: get a good overview of what is there and try to generalize some rules about ROT.
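Tip 1 can also be done outside the spreadsheet. Here is a small sketch, assuming the URLs are available as a plain list; the example.org addresses and the function name `group_by_site` are made up. It groups pages by host name using Python’s standard `urllib.parse.urlsplit`, which gives the same overview as sorting the URL column.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_site(urls):
    """Group URLs by host name, with hosts and pages in alphabetical order."""
    groups = defaultdict(list)
    for url in urls:
        # netloc is the host part, e.g. "docs.example.org".
        groups[urlsplit(url).netloc].append(url)
    return {host: sorted(pages) for host, pages in sorted(groups.items())}


urls = [
    "http://b.example.org/x",
    "http://a.example.org/z",
    "http://a.example.org/y",
]

for host, pages in group_by_site(urls).items():
    print(host, len(pages))
```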

We need some sense in

We need some sense in the naming of XML feed buttons. I have seen buttons called “XML” (a de facto standard, I guess it is too late to change this), “RDF” and “FOAF” (and “XFML”). Rule: the button should name the feed standard (“FOAF” or “XFML”), not the language it’s expressed in (“RDF” or “XML”).

Inbox Buddy: surely a useful

Inbox Buddy: surely a useful product but with the typical branding mistakes that come with products developed by techies only. Two taglines, neither of which makes much sense (one says “you hate email and now you’ll hate it less”, the other one says “this is such a cool product technology wise”), a confusing value proposition (and a badly defined target audience it feels like), stock images. As I said, I’m sure the product itself is good though.

Content Inventory day 2

Content inventory, 3828 pages, a lot of them PDF files. I am accessing them through a VPN and entering the categories for the CI in Excel. Yesterday I spent the morning setting up the Excel spreadsheet and deciding on the categories. I now have a dropdown in the “content comments” column with 13 items: “not applicable”, “could not access”, “ok”, “title”, “outdated” and so on. There is a column next to that for written comments.

I spent the afternoon doing the first 123 items, with help from the person familiar with the content, who explained a lot about it to me. The content inventory is more detailed than what I understand people usually do. Today I am working from home and hoping to get a lot done. First problem: getting the VPN client to work.

Bill Kearney explains XFML better

Bill Kearney explains XFML better than I do: “If they wanted to get an ‘overall picture’ they’d benefit from using something like XFML. With XFML it’s possible to deliver a pretty large file that contained the topic framework and associations of the items. This would, essentially, be ALL items from the site. Although, one could consider using a dynamic XFML generator that constrained the data to within certain ranges (like by year) but that’s a side-issue. Rather than have the XFML contain content it just contains the topics, titles and URL of the actual items themselves. This way if someone wants to find out what items exist within a topic they don’t have to crawl the site looking for it. They can pull the XFML and THEN decide which items to read.”
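What Bill describes can be sketched as a small index that holds only topics, titles and URLs. To be clear, this is not XFML syntax and not a parser for it: the `Item` class, the sample entries and the function `items_for_topic` are assumptions made up to show the idea in Python. Once a client has pulled such an index, finding the items under a topic is a simple filter, with no crawling involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One indexed item: metadata only, no content."""
    title: str
    url: str
    year: int
    topics: frozenset


# Made-up index, standing in for what a site might publish in one file.
INDEX = [
    Item("Faceted browsing for kids", "http://example.org/1", 2002,
         frozenset({"facets", "children"})),
    Item("Content inventory tips", "http://example.org/2", 2002,
         frozenset({"content inventory"})),
    Item("Facets and ranges", "http://example.org/3", 2001,
         frozenset({"facets"})),
]


def items_for_topic(index, topic, year=None):
    """Return items filed under a topic, optionally limited to one year."""
    return [
        item
        for item in index
        if topic in item.topics and (year is None or item.year == year)
    ]


for item in items_for_topic(INDEX, "facets"):
    print(item.title, item.url)
```

The optional `year` argument mirrors the “constrained by year” side-issue from the quote: the publisher could apply the same filter before sending the file, to keep it small.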


This blog accounts for about 8% of the total visits (36,000 visits a month) to poorbuthappy.com (which contains other websites of mine like the Colombia website, of which the discussion pages take over 30% of the hits to this domain). Search engines bring about 15,000 referrers a month (mostly Google, October 2002 stats), news aggregators (RSS) bring about 1,000 referrers a month (and growing). I removed all pictures from my Colombia site at the beginning of this year and optimized my pages because I hit my bandwidth limit (3 gig a month), but now I am almost hitting it again. The ease blog is responsible for half a gig a month. I don’t want to make the page shorter because I think that reduces the value (I like long blogs). Some optimization should keep extra costs off for another few months, but then what? I will have to pay about $450 a year extra. I am looking for cheaper alternatives and bandwidth optimization techniques.

Burningbird: The White Shoes of

Burningbird: The White Shoes of Technology: (via Dave) “This week, the RDF Working Group released drafts of six working documents for the RDF specification. Six. That’s a whole lot of work. However, rather than getting a pat on the back with a quiet “Well done.”, the group has seen their effort catechised mercilessly.”

Opening up discussion about your efforts seems to, again and again, invite some people to criticize them rather mercilessly, seemingly from an “I wasn’t invited to your party so now I’ll crash it” point of view. Luckily I haven’t had that problem yet with XFML (knock wood), but it is something I’ve been seeing a lot lately.

Bill Kearney: “Most data is

Bill Kearney: “Most data is currently not being shared with any sort of metadata applied, let alone smart stuff like XTM or XFML. Having a starting point will help make the value of metadata obvious. It’s that nasty chicken-and-egg sort of problem. And here we’re sort of arguing over what kind of chicken to use and we’ve got no eggs (and the users just want breakfast).
[…] the interim period of chaos really puts the technology to the test.”