Spanish blog by a friend of mine – give her some traffic!
Looking for a hosting company that lets me host multiple (smallish) sites, with the usual goodies but nothing special (multiple MySQL databases, …), and costing significantly less than 500 US$ or 350 UK pounds a year. Recommendations welcome!
What happens if millions of new pictures are posted to the web every day? The pictures could be auto-connected by non-subject metadata like location or timestamp. What is the value of a picture without the story? What will happen to abandoned picture-collecting websites? Is there a standard way to embed metadata about a picture in the picture itself, so it doesn’t get lost?
I like Livia’s new homepage.
Livia (from Brazil) comments on my post “Racial and Ethnic classifications as an example of classification challenges”: “Not because it’s a bad or a good idea, but how they simply cloned it from the US, even though it makes absolutelly no sense to our population or ethnical background. We don’t have a racial problem in Brazil (we have a wealth distribution problem) and it really upsets me that they are turning it into something it is not.”
Classification systems really have a tendency to stick around, even when they’re no longer useful or just not applicable to the situation (as in Livia’s example).
TeledyN: The End of RSS: can it scale? Sure – if aggregators play nice. Can’t they be forced to play nice? Slashdot already does something like this I believe.
I’d like to see a simple XML format to express timelines with events on them, and a few tools to create this XML and turn it into Flash, HTML and such. The format should allow for merging timelines.
(Just some ideas:) A timeline consists of a StartingPoint, an EndPoint, and Events in between. Each Event is optionally identified by one or more URIs (for merging), has a required StartingPoint (datetime) and an optional EndPoint (datetime, for events that happen over time). Events can be nested within events. Each event can have a URL (semantics: where to go when the user clicks on the event), a title, a description and an image.
We should be able to express things like conferences, or a personal timeline in this format.
The format should allow merging: time is universal. Events can be merged when they have at least one identifying URI in common. We should be able to merge your description of a conference (with your blog entries as events, for example) with my description of the same conference.
Anyone interested in working out such a format?
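To make the merging idea a bit more concrete, here is a rough Python sketch of the data model and the merge rule described above. It is not the XML format itself. The class and function names (`Event`, `merge_event`, `merge_timelines`) are my own inventions, and so are the choices made when two descriptions disagree (earliest start, latest end, first non-empty title wins). The example.org URIs are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class Event:
    start: datetime                                  # required StartingPoint
    title: str = ""
    uris: Set[str] = field(default_factory=set)      # identifiers used for merging
    end: Optional[datetime] = None                   # optional EndPoint
    url: Optional[str] = None                        # where a click on the event leads
    description: str = ""
    image: Optional[str] = None
    children: List["Event"] = field(default_factory=list)  # nested events


def merge_event(a: Event, b: Event) -> Event:
    """Combine two descriptions of the same event (assumed conflict rules)."""
    ends = [e for e in (a.end, b.end) if e is not None]
    return Event(
        start=min(a.start, b.start),
        title=a.title or b.title,
        uris=a.uris | b.uris,
        end=max(ends) if ends else None,
        url=a.url or b.url,
        description=a.description or b.description,
        image=a.image or b.image,
        children=merge_timelines(a.children, b.children),
    )


def merge_timelines(mine: List[Event], yours: List[Event]) -> List[Event]:
    """Merge two event lists; events sharing at least one URI are merged."""
    merged: List[Event] = list(mine)
    for event in yours:
        for i, existing in enumerate(merged):
            if existing.uris & event.uris:           # at least one URI in common
                merged[i] = merge_event(existing, event)
                break
        else:
            merged.append(event)                     # no match: a new event
    return sorted(merged, key=lambda e: e.start)


# Two descriptions of the same (made-up) conference, each with a blog entry.
CONF = "http://example.org/conference"
mine = [Event(start=datetime(2004, 3, 1, 9), title="The conference", uris={CONF},
              children=[Event(start=datetime(2004, 3, 1, 10), title="My blog entry",
                              uris={"http://example.org/me/entry"})])]
yours = [Event(start=datetime(2004, 3, 1, 9), end=datetime(2004, 3, 3, 17), uris={CONF},
               children=[Event(start=datetime(2004, 3, 2, 14), title="Your blog entry",
                               uris={"http://example.org/you/entry"})])]
merged = merge_timelines(mine, yours)
```

After the merge there is a single conference event that carries both blog entries as nested events. Events without any URI are never merged, since there is nothing to match them on.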
Some links about the international coffee trade:
The Campaign to Humanize the Coffee Trade:
– “The world trades more coffee than any commodity except petroleum (and illegal drugs).”
– “Starbucks buys a miniscule amount of its coffee from the Fair Trade system, less than 0.1 of 1 percent of all the beans that Starbucks buys. But, he says, don’t blame the company for that. Smith says the problem is that Fair Trade activists are trying to sell coffee that’s not always very good. He says Starbucks planned to buy (but then rejected) some shipments of Fair Trade coffee last year, because the beans didn’t meet Starbucks’ quality guidelines. […] the company makes virtually the same profit, whether it sells beans stamped “Fair Trade” or not.”
– “‘One needs to choose,’ she says slowly, searching for just the right words. ‘You have only so much time in your life, and so you need to choose your issues. You need to choose the things that you want to be passionate about, the things you want to care about, give your money to, give your attention to.'”
The Campaign to Humanize the Coffee Trade:
– In order to get to that meeting I just mentioned, between Denaux and the farmers – a two-hour meeting – we had to drive for 10 hours over bone-crunching mountain roads. That trip didn’t unearth any scandals. But now the Fair Trade coffee movement has a face.
Global Exchange : Fair Trade Coffee: “To become Fair Trade certified, an importer must meet stringent international criteria; paying a minimum price per pound of $1.26, providing much needed credit to farmers, and providing technical assistance such as help transitioning to organic farming.”
If you want to learn about some of the specific challenges in developing taxonomies, have a look at the racial and ethnic classifications used in the US census. The development, evolution and discussions around this taxonomy highlight many of the problems you can encounter on a smaller scale when developing taxonomies for websites. These problems are inherent to what it means for us to classify. There’s no way around them.
In October 1997, the Office of Management and Budget (in the USA) announced the revised standards for federal data on race and ethnicity. The taxonomy is as follows:
Please choose your race (one or more):
– American Indian or Alaska Native
– Asian
– Black or African American
– Native Hawaiian or Other Pacific Islander
– White
– Some Other Race
Please choose your ethnicity (only one):
– Hispanic or Latino
– Not Hispanic or Latino
Hispanics can be of any race (so you can choose Black and Hispanic). The Some Other Race category was introduced in the Census 2000 questionnaires; it was not originally part of the standard taxonomy.
One could write a book about this taxonomy. I’ll try to keep this short and funky.
In 1977, the taxonomy was like this:
Please choose your race (only one):
– American Indian and Alaskan Native
– Asian and Pacific Islander
– Black
– White
Please choose your ethnicity:
– Hispanic origin
– Not of Hispanic origin
Back then, the racial categories were considered scientifically valid and mutually exclusive. Obviously, things have changed since.
In 1990, “Other Race” was added, but the biggest change is that people can now choose more than one race. In the 1990 census, half a million people ignored the instructions and checked more than one box. Something had to be done. Imagine being a kid with parents of mixed race.
One result is that data from the 1990 census cannot easily be compared with data from the 2000 census. This is nothing new. Almost every census for the past 200 years has collected racial data differently from the one before it, and extracting racial trends is deeply problematic.
Change in taxonomies is something we need to prepare for. It means we will not always be able to effectively compare data over time. It also means we should avoid building the taxonomies we expect to change (and most will) too deeply into the infrastructure of our websites (say, URLs or database schemas).
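As a small illustration of keeping a taxonomy out of the infrastructure, here is a Python sketch. Everything in it is hypothetical (the `Page` class, the `/pages/<id>` URL scheme, the category names): the point is only that URLs are built from a stable id, while the taxonomy lives in plain data that can be replaced without touching URLs or pages.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Page:
    page_id: int   # stable identifier, independent of any category
    title: str


def page_url(page: Page) -> str:
    # The URL depends only on the stable id, never on a category path,
    # so reclassifying a page does not break links to it.
    return f"/pages/{page.page_id}"


def pages_in(category: str, taxonomy: Dict[int, List[str]],
             pages: List[Page]) -> List[Page]:
    # The taxonomy is a separate mapping: page id -> category labels.
    return [p for p in pages if category in taxonomy.get(p.page_id, [])]


pages = [Page(1, "Widget manual"), Page(2, "Gadget manual")]

# Two versions of the (made-up) taxonomy; only the mapping changes.
taxonomy_v1 = {1: ["products/widgets"], 2: ["products/gadgets"]}
taxonomy_v2 = {1: ["hardware"], 2: ["hardware", "accessories"]}

hardware_pages = pages_in("hardware", taxonomy_v2, pages)
```

Moving from `taxonomy_v1` to `taxonomy_v2` changes which pages show up under which category, but `page_url` returns the same URL for every page before and after.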
Of course, a racial taxonomy is deeply suspect. Scientists these days generally agree that race and ethnicity are social constructions. Humans cannot be categorized in a taxonomy of races based on biological information in a scientifically valid way. However, race continues to be a social reality in the US. It is this social reality that the taxonomy is trying to capture. Since the social reality changes, the categories will continue to change. Are you recognizing any of this in your own work yet?
This is one reason why people are asked to self-categorize. In the past, census enumerators were instructed to report a person’s race based on observation – you can imagine the problems.
Self-categorization of course has many problems: people may perceive their choice of race to have some influence on their future (job availability), which can affect their choice. And many people have only limited awareness of their own genealogy – they may not know what race they are supposed to be.
The race categorizations are heavily discussed and disputed every time they are changed. Many political groups argue for or against certain changes in the taxonomy.
The reason is simple: the categories have an impact on policy. If a certain group isn’t categorized in the taxonomy, they can’t be easily measured, and it becomes much harder to lobby for certain changes that should benefit that group. For example, for the 2000 census many advocacy groups for racial minorities encouraged multiracial people to check only a single race (the minority race). Classification is political, and if you’ve ever worked for a large company trying to implement an intranet, you’ll recognize this.
There is much (much!) more to say, and I feel bad for only touching briefly on such a fascinating topic, so here’s some bed-time reading to get you started:
– Recommendations from the Interagency Committee for the Review of the Racial and Ethnic Standards to the Office of Management and Budget Concerning Changes to the Standards for the Classification of Federal Data on Race and Ethnicity.
– Racial and Ethnic Classifications Used in Census 2000 and Beyond
– Using the New Racial Categories in the 2000 Census
I got Joe’s picture as well.
Like a look into the future, but happening today: anti-mega: keitai.
Metadata Generation Research Project:
“The metadata generation research project is developing a model that will facilitate the most efficient and effective means of metadata production by integrating human and automatic processes.”
Short film wows voters in seconds: “A surreal 15-second black movie comedy about an escapologist has won a short film contest.”
Language Log: like is, like, not really like if you will (http://itre.cis.upenn.edu/~myl/languagelog/archives/000141.html): “like is definitely a more powerful (and useful) expression than if you will. Perhaps that’s why some people use it, like, too much?”
Orange Cone: A photogeoblog sketch: I like these stories that envision how things might work.
Bloug: “So a modest proposal: what if everyone involved in content management–the publications, the web sites, the meetings and conferences–banned CMS vendors for, say, one quarter? No vendor exhibitions at meetings, no product mentions on discussion lists, no CMS purchases, no nothing. Just discussion about all there is to content management besides the technologies.”
Lou is right: the CMS discussion is dominated by the CMS vendors – that needs to change.
Many-to-Many: Otlet: Some ideas die because they are wrong: “The failure of universal subject classification working in concert with the mutable forces of scholarship didn’t happen because that idea fell out of fashion – it was fashionable as recently as 1998, with people being paid fabulous sums of money to pursue it. It failed because it does not work.”
Simon Willison implemented daily links at the top of his blog. I really like the CSS treatment of the visited links: in a dense list like this it makes sense. The strikethrough links are the ones I just visited:
A newborn: InformationScienceTheoryWiki
On drawingcenter.org, I saw these directions:
If you are traveling by subway, you can take the trains to the Canal Street station.
In the code, the trains are a bunch of image tags. This can easily be made accessible by adding ALT attributes, but I wanted to try something more semantic (not sure if it’s useful, just for fun, after my BLOCKQUOTE experiments).
My best take was to use SPAN tags and the letters of the trains, but I couldn’t get the CSS to replace the letter with the image… I’d be thrilled if someone could crack this!
(via Danny Ayers) SchemaWeb – RDF Schemas Directory: “SchemaWeb is a place for developers and designers working with RDF. It provides a comprehensive directory of RDF schemas to be browsed and searched by human agents and also an extensive set of web services to be used by RDF agents and reasoning software applications that wish to obtain real-time schema information whilst processing RDF data.”
I got Tom’s picture as well now!
I found a picture of Dare Obasanjo, still looking for Joe Gregorio, JayT, and Tom Hoffman.
Center For the Ethnography of Every Day Life: “Before the abstractions of social science, there are people’s stories, the emotional worlds of disappointment and uncertainty, and the brave coping of everyday life. Established in 1998 with a grant from the Alfred P. Sloan Foundation, the Center for the Ethnography of Everyday Life fosters research and training to document the challenges of American working families. Working people, everyday lives explored in the tradition where ethnography and documentary come together.”
Hey, so it’s ugly, but that’s all my fault. Really. Thanks to the excellent Bloglines service that I’ve been using lately (it just works for me), I’ve got a blogroll!
Jon is experimenting with automated categorization of blog posts. XML.com: Working with Bayesian Categorizers: “There’s been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes.”
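For anyone wondering what such a categorizer boils down to, here is a minimal naive Bayes sketch in Python. It is not Jon's code or the code from the XML.com article. The class name, the whitespace tokenizing and the add-one smoothing are my own simplifying assumptions, and the training posts are made up.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.post_counts = Counter()             # category -> number of training posts
        self.vocabulary = set()                  # every word seen during training

    def train(self, category, text):
        words = text.lower().split()             # naive whitespace tokenizer
        self.word_counts[category].update(words)
        self.post_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the most probable category, or None if nothing was trained."""
        words = text.lower().split()
        total_posts = sum(self.post_counts.values())
        best, best_score = None, -math.inf
        for category in self.post_counts:
            # log prior: how common the category is among training posts
            score = math.log(self.post_counts[category] / total_posts)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                # add-one smoothing so unseen words don't zero out the score
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best, best_score = category, score
        return best


categorizer = NaiveBayesCategorizer()
categorizer.train("interesting", "rdf schema taxonomy metadata")
categorizer.train("boring", "celebrity gossip sports scores")
guess = categorizer.classify("new rdf taxonomy")
```

With these two training posts, `guess` comes out as "interesting", because "rdf" and "taxonomy" were only ever seen in that category. A real categorizer would need far more training data and a better tokenizer.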