www2004: themes
Themes at www2004: semantic web, learning/information extraction, search.
One big theme at www2004 was the semantic web. As I've indicated earlier, I'm skeptical, or at least unsure, about how it might develop and be taken up by people other than wonky researchers. But it was definitely pervasive at the conference, appearing prominently in Tim Berners-Lee's keynotes and essentially amounting to a paper track of its own throughout. Topics ranged from practical, application-focused stuff like semantic email, semantic browsers, and semantic search to more esoteric papers about languages, specifications, and theories. I didn't catch enough of this stuff to really know what the state of the art is, but it's possible that tools will begin to filter out to the general public in a significant way. I do buy that we'll see certain communities using it, e.g. scientific communities that share lots of data.
Another theme was various approaches to learning, classification, and information extraction. There was perhaps more focus on unsupervised techniques than in the past (which I find encouraging; in many real-world applications my bias is against anything that requires human labeling). Approaches to classifying web pages included comparing them to a pre-specified topic hierarchy (e.g. from Yahoo!), analyzing page structure (i.e. the visual and structural cues people use to find headlines and information), and breaking a page into component "blocks" and determining the importance of each block. Approaches to extracting information from the web included various ways of (shallowly) analyzing sentence structure to determine facts, such as looking for indicative phrases, learning probabilistic phrase structures, and identifying entities and the types of facts about them.
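To make the "indicative phrases" idea a bit more concrete, here's a minimal sketch in Python. It isn't taken from any particular paper at the conference; the two patterns, the relation names, and the `extract_facts` helper are all my own assumptions, picked just to show how shallow this kind of extraction can be: no parsing, just regular expressions keyed on phrases like "is a" and "is located in".

```python
import re

# Hypothetical indicative-phrase patterns (illustrative assumptions, not from any paper).
# An "entity" here is just a run of capitalized words, which is deliberately crude.
ENTITY = r"[A-Z][\w-]*(?: [A-Z][\w-]*)*"

PATTERNS = [
    # "X is a/an Y" up to the next punctuation mark or end of text
    ("is_a", re.compile(r"\b(" + ENTITY + r") is an? ([a-z][a-z -]*?)(?=[.,;]|$)")),
    # "X is located in Y"
    ("located_in", re.compile(r"\b(" + ENTITY + r") is located in (" + ENTITY + r")")),
]


def extract_facts(text):
    """Return (entity, relation, value) triples found by the patterns above."""
    facts = []
    for relation, pattern in PATTERNS:
        for match in pattern.finditer(text):
            facts.append((match.group(1), relation, match.group(2).strip()))
    return facts


if __name__ == "__main__":
    sample = "Mount Everest is located in Nepal. Python is a programming language."
    for fact in extract_facts(sample):
        print(fact)
```

A real system would presumably learn or weight patterns like these rather than hard-code them, which is roughly where the probabilistic phrase-structure work comes in.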
Search was of course on everybody's mind. Aside from scoring mentions in the invited talks, it was the subject of papers exploring everything from link graph analysis to combining full-text search with database-style search, plus other stuff I didn't get to see. The impending (or at least much-hyped) Google-Microsoft-Yahoo battle over search was a constant subtext.
That said, WWW is a very diverse conference; I focused on a few particular topics, and I'm sure other people would report something totally different as the themes of the conference.