Monday, December 15, 2008

OpenX Feeling Bullish - Ad Server Company Announces Strong Growth

OpenX, an open source ad server
for web publishers, released statistics today to show its strong recent
growth - especially in the last 6 months. We interviewed the CEO of
OpenX, Tim Cadogan, about the data. We also wanted to know how OpenX
compares with Google's competing product, Ad Manager, and we found out
exactly how OpenX plans to make money.

According to OpenX, as of December 2008 more than 300 billion ad
impressions now run through its software every month. Its core product
is still the open source OpenX Ad Server; version 2.6 was launched in
August 2008 and included a new API. This product has had more than
10,000 active downloads and is running at a rate of 25 billion ad
impressions per month.

Article Link

OpenX Shows Impressive Growth, Ramps Up Revenue Streams

OpenX (which used to be called Openads),
provider of an open-source ad serving solution for web publishers (we
use it at TechCrunch), is growing like a weed under the leadership of
former AOL CEO Jonathan Miller, who is the company’s chairman, and ex-Yahoo executive Tim Cadogan,
who is CEO. According to the company, it is serving well over 300
billion ad impressions through its software as of this month, while its
Hosted product line has achieved a run rate of more than 1 billion ad impressions per month.

Article Link

Thursday, August 14, 2008

Freebase Parallax Taunts Us With Awesome Semantic Web Video

Staff researcher David François Huynh has created an interesting tool for browsing semantic database Freebase, called Freebase Parallax. Written up by ZDNet's Oliver Marks, the video Huynh recorded demonstrating Parallax (below) will knock your socks off.

Unfortunately, actually using Parallax demonstrates just how far
from solid Freebase, one of the semantic web's poster children, really
is. The idea is to allow you to apply multiple filters for your
searches and embed live charts in a blog. It's a beautiful idea; check
out the video.

Here's the video below. If you find yourself saying "get to the point already," skip to about 1:30 in the timeline.

Freebase Parallax: A new way to browse and explore data from David Huynh on Vimeo.

Unfortunately, when we tried out a number of searches in Parallax,
very few subjects were well populated at all. We found duplicate
subject titles where one held solid data and the other didn't, but even
that was a best case scenario. In search after search, we found next to
nothing in Freebase.

The example above is nice, but let's say I want to find out
something about black women scientists. No luck. History of the
internet? Not much information there. Venture Capitalists? Blank
profile pages.

This ought to work. Freebase has taken more than
$50 million in venture investments, they have a small army of volunteer
and computer scientist contributors, they've got robots pumping their
database with information automatically. There are now 60% more articles in Freebase than there are in English Wikipedia. So what's the problem?

We wrote last week about ontological concerns about the semantic web,
but Parallax shows that there are more superficial problems. An
unfriendly UI has been Freebase's excuse for a long time, despite
recent improvements to it. We love the idea of the semantic web, but
give its granddaddy website a usable UI like Parallax and we're left
questioning just how much there really is inside Freebase anyway.

For an alternate view, see Alex Iskold's Freebase: Dispelling the Skepticism.
Some of the fault here may lie in the coolness ratio of the video to the
Parallax app, but for now we feel inclined to look elsewhere for the
"semantic web killer app."

Article Link

Wednesday, August 13, 2008

Lehman’s Online U.S. Advertising Forecast: Another $20 Billion In Growth By 2012 And Online Video Takes Off

I was (digitally) leafing through the latest Lehman Brothers
Internet Data Book for August this morning, and came across these
forecasts for total U.S. online ad spending and online video
ad spending.

Video ads are the hottest area of growth. Analyst Doug Anmuth thinks
that online video ad spending will reach $1.1 billion this year (up 63
percent), and more than double to $2.4 billion over the next two years.

He also thinks that total online advertising spending in the U.S. will go
from $26.1 billion this year to $45.5 billion in 2012 (consequently
increasing from 8.8 percent of total advertising spending to 13.7 percent).
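
To put Anmuth's numbers in perspective, here is a quick back-of-the-envelope sketch of the compound annual growth rates they imply. The dollar figures come from the forecast above; the `cagr` helper and the year counts (four years from 2008 to 2012 for total online spending, two years for video) are our own assumptions for illustration, not part of the Lehman report.

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end / start) ** (1 / years) - 1

# Figures in billions of dollars, taken from the forecast quoted above.
# The year spans are assumptions: 2008 -> 2012 and 2008 -> 2010.
total_online = cagr(26.1, 45.5, 4)
online_video = cagr(1.1, 2.4, 2)
added_dollars = 45.5 - 26.1

print(f"Total online ad spending: {total_online:.1%} per year")
print(f"Online video ad spending: {online_video:.1%} per year")
print(f"Dollars added by 2012: ${added_dollars:.1f} billion")
```

By this rough math, total online spending would grow about 15 percent a year and video close to 48 percent a year, and the headline's $20 billion is a rounded-up $19.4 billion.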

Here are some tables with his estimates:

Also, to give some perspective on where online advertising is
compared to TV advertising, he offers this comparison chart of the
first decade of broadcast TV advertising vs. cable TV advertising vs.
Internet advertising. The 30 percent growth rate for Internet
advertising is double the rate of where cable advertising was at the
same point in its history, and triple the rate of broadcast TV
advertising. There, don’t you feel better already?

Article Link

Y Combinator To Offer Standardized Funding Legal Docs

Early stage venture firm Y Combinator,
which has funded over 102 young startups, has “open sourced” the legal
documents that they provide to their startups to use as they seek
additional funding beyond what they’ve gotten from Y Combinator. The
documents were created with their law firm, Wilson Sonsini Goodrich
& Rosati and are available here.

The goal, says Y Combinator cofounder Paul Graham,
is to help young startups avoid at least some of the legal costs
associated with that first round of financing. The lawyering fees don’t
vary much based on the size of the round, and so a significant portion
of small rounds can go directly to the lawyers on the deal. Companies
are routinely forced to pay the legal bills of the investors, too,
making the situation worse.

Jeff Clavier, the founder of early stage fund SoftTechVC, told me yesterday that the average legal bills on a deal are $20-$30k. Other angel investors gave estimates in the same range.

The Y Combinator documents are designed to have “terms close to neutral, in the sense that they favor neither the investor nor the startup,” said Graham in an email.

We’re hoping that this will cause there to be a lot more
startups. I know (because for many years I was one) that there are a
lot of rich technology people who would do angel investing but don’t
because it seems like a schlep. And obviously there are lots of
startups desperate for funding. We’re hoping this document will bring a
lot more of them together.

Is Y Combinator helping their competitors by making the legal process easier? Absolutely. And Graham doesn’t seem to mind.

On a related note, earlier this year TheFunded started allowing entrepreneurs to publish the various term sheet clauses that venture capitalists were asking for.

Article Link

Monday, August 11, 2008

How to Demo your Startup

1. Show product in first 60 sec
2. The best products take less than five minutes to demo
3. Leave people wanting more
4. Talk about what you've done, not what you're going to do
5. Understand your competitive landscape, current and historical
6. Short answers are best
7. Powerpoint bullet slides are death
8. How to use this device called the phone
9. How to handle questions you don't know the answer to
10. Always confirm the time of your meeting/call, and always be 15 minutes early

Article Link

Monday, August 4, 2008

MySpace Banks On ‘Hypertargeting,’ But Plain Targeted Ads Continue To Suffice

Although MySpace continues to maintain that its hypertargeting
solution will ultimately lead to higher revenues, significant hurdles
remain. A day before the Fox Interactive Media unit’s parent News Corp (NYSE: NWS) releases its Q2 earnings, WSJ outlines the challenges to greater ad spending on MySpace.

-- Engagement: While there’s little question that users of
social net sites like MySpace and its rivals are engaged while they’re
on the site, it is uncertain whether they are too focused on changing
their profile pages and sending messages to friends to pay attention to
ads. And so, basic targeting placements are considered perfectly
adequate for media buyers and their clients’ needs.

-- Privacy: The rumblings from the Federal Trade Commission,
Congress, state lawmakers and digital rights advocates have spooked
some advertisers who don’t want their marketing efforts to turn into an
embarrassing footnote in a legal case. While there wasn’t much backlash
against participants in Facebook’s Beacon ad targeting program last
fall, the initiative still has many marketers and agencies wary of
being associated with the kind of negative publicity it drew.

-- Looking long-term: While FIM has touted the importance of
hypertargeting, executives are taking care not to talk it up too much.
Keeping investors in mind, the company often highlights its $900
million, three-year deal with Google (NSDQ: GOOG)
for featuring sponsored links on MySpace. And ads that take over the
site’s home page are also a source of value. Executives also argue that
the hypertargeting program is about more than being able to charge high ad
rates. They place the emphasis on the long-term benefits for advertisers
who use the program over time, such as broader insights into
consumers’ offline behavior beyond the basic targeting on the site.

Article Link

Monday, July 14, 2008

Text Messaging Huge with Young Adults

Text messaging is especially popular with US adults ages 18 to 34, according to Universal McCann's 2008 "Media in Mind" study. Respondents from that age group sent an average of 13 text messages every week.

In last year's survey, nearly one-half of all US adults said
they had never sent a text message. This year only 41% said so. Among
18- to 34-year-olds surveyed this year, just 22% had never sent a text
message, down from 38% the prior year.

"The great unwashed—those people who have never sent a
text message—is getting smaller all the time," said Graeme
Hutton, senior vice president at Universal McCann, in a MediaPost article.

Number of Text Messages Sent per Week according to US Adults, by Age, 2008

Text messaging is still new for many marketers, as evidenced by a February 2008 ExactTarget
study. The proportion of Internet users surveyed who owned a mobile
phone and had made a purchase after receiving a text message was a
paltry 6%. That percentage was higher among younger users, but still
mostly in the single digits.

Such low numbers could be interpreted to mean that text
messages are not a very effective way to market, but they might also
just reflect that such marketing is still relatively rare.

US Internet Users Who Have Purchased due to Receiving Marketing Messages, by Age and Channel, February 2008 (% of respondents in each group)

A December 2007 BIGresearch
study of US Internet users found similarly low text message influence:
6.4% of respondents said they had bought electronics because of text
messages, and still fewer said text messages had influenced their purchases of other
types of goods.

US Adult Internet Users Whose Purchases Are Influenced by Mobile Phone Text Messaging or Video, by Product or Service Category, December 2007 (% of respondents)

If mobile marketing is to move beyond the experimental budget stage,
text messaging is likely to be part of the mix, in part since the
messages lend themselves to existing terminology and benchmarks,
according to John du Pre Gauntt, senior analyst at eMarketer.

"Mobile messaging is tailor-made for getting mobile marketing past the
early adopter stage and into the mainstream," Mr. Gauntt said. "Messaging has a clear currency—that is, messages
sent, received, opened or acted upon—for all parts of the mobile
marketing chain to use."

Article Link

Thursday, July 10, 2008

MSFT’s acquisition of Powerset is not about search

The recent $100 million acquisition of Powerset,
the semantic search engine company, by Microsoft looks to have more to do
with advertising than with beefing up the software giant’s search

So far, Powerset has only demonstrated its technology against
Wikipedia. It’s a neat exhibit but it is highly compute-intensive. You
would need a massive amount of computing power to run that semantic
search technology across the entire Internet.

Also, how many searches would benefit from a semantic component? Not many. Most searches are fairly direct in my experience.

Where Powerset’s technology could prove its usefulness is in
contextual advertising. That’s a much smaller semantic problem to
handle, and it is a semantic problem that would result in the largest

The reason Yahoo is outsourcing some of its advertising to Google is
that GOOG is better at contextual advertising than Yahoo. GOOG can
monetise those searches better than anybody else, at least for now.

If Microsoft can improve contextual advertising then maybe it can
win that type of business from Yahoo and others. If advertisers through
MSFT get better conversions than through Google then Microsoft wins,

Article Link

Tuesday, July 8, 2008

iRobot Eyes Your Lawn With Their Latest Bot


One of the more useful consumer bots to come out in the last
several years is the Roomba. Who wouldn’t want a robot that sweeps
their floor automatically? It’s a great gadget and all, but we
need more robots around the house. According to a recent patent filing,
iRobot is eying your lawn with their next creation.

The 84-page patent filing shows us several new designs for a bot
whose sole purpose is to trim your lawn. They seem to be looking at
both an electric and a gasoline hybrid motor for power, and want to
include a variety of features such as an edge trimmer, and the ability
to remember the layout of your lawn for future mowings. What I wouldn’t
have given for one of these as a kid. Then again I’m not sure it would
be quite up to mowing 3 acres of land.

Article Link

Metaweb's Freebase Now 60% Larger Than English Wikipedia

Wikipedia is an incredible monument to human creativity and collaboration, but as
one era of innovation passes into another, semantic web advocates want
to augment the huge human input into the web with machine learning. The
semantically enriched common database Freebase announced today that it will soon reach the milestone of 4 million topics added to its collection.
That's 60% more than English Wikipedia's 2,445,041 articles and almost
half the size of Wikipedia's full 10 million articles in 250 different
languages.
What is Freebase? It's a database of information that's organized by
people and machines and is particularly well suited for machine
reading. You're not a machine - so why should you care? Read on.


What You Can Do With Freebase

Semantic web expert and RWW contributor Alex Iskold spelled out the value of Freebase in great detail
here in May. The long and short of it though is that Freebase learns
fast through a combination of automated information harvesting and
machine and human organization. It collects information from sources
like Wikipedia and MusicBrainz and from user uploads and edits.

Programmatic access to that now structured data allows all kinds of
mashups to be built that "know things." Check out, for example:

  • Taught or Not - a cute little game that tests your knowledge of who influenced who throughout the history of thinkers.

  • Shot or Not - another game that tests your knowledge of the causes of death of various famous people throughout history.

  • Random Walk Through Influences - a little app that displays the chain of historical influence around any artist whose name you enter.

  • Pull Quotes - If you have any interest in politics, check this out - it's awesome!

  • Powerset - the Natural Language search engine acquired by Microsoft last week uses Freebase, too.

Seriously, Though

Obviously most of these are relatively frivolous use cases. Are
there serious powerful use cases for Freebase yet? We're not entirely
sure. There are big gaps in the data, which is understandable, but the
interface is so much harder to use than Wikipedia's that there's reason
to be concerned about expectations of substantial human editing. The
interface was much improved this summer and is now far more usable, but
it's still harder than it needs to be.

We've certainly got our questions about Freebase, but we're excited
about what Metaweb is doing with it. They are smart, well funded and
aiming high. The community there deserves congratulations on growing to
4 million reusable articles, something that the celebrated English
Wikipedia community can only aspire to.

Article Link

Wednesday, July 2, 2008

Microsoft’s Powerset Acquisition: Integration By End Of Year

I spoke with Powerset cofounder/CEO Barney Pell and Microsoft’s Live Search General Program Manager Ramez Naam shortly after Microsoft’s announcement of their acquisition of Powerset earlier today.

Microsoft intends to use Powerset’s natural language search
technology as a major differentiating factor vs. No. 1 search player
Google (see our recent coverage of Live Search Cashback, another Microsoft search effort aimed at getting more market share).

TechCrunchIT goes into detail
on how effective Powerset may be as a weapon. But a few things are
clear - the resource limitations (cash and computing resources) that
slowed Powerset’s development are now history. The relevance problem is
less important since Microsoft’s core search relevance is quite good. And
users really seem to like the beta launch of Powerset even with the limited dataset.

Naam says 5% of searches contain elements of natural language that
keyword based search algorithms don’t handle well, and there’s an
assumption that as better results are returned, more people may start
to simply type a normal sentence instead of a couple of keywords.
Microsoft will integrate at least parts of Powerset technology into
Microsoft Live Search by the end of the year, Naam says. I expect we’ll
be hearing a lot more about natural language search coming out of
Microsoft shortly.

Article Link

Tuesday, July 1, 2008

Powerset - iPhone Interface

Powerset is a semantic search engine that scans Wikipedia and the open
database site Freebase. The company created an iPhone version of its
site in May after execs realized that a third of its employees owned
iPhones, and they wanted a simpler, faster version of the site for
mobile browsing. Powerset allows users to type in questions such as,
"Who won the NBA Championship in 2008?" Linguistic technology parses
every sentence in a Wikipedia or Freebase entry and provides condensed,
text-based results with links on key search words to the full entries
for further clarification.

Monday, June 30, 2008

Online Ads’ Global Share To Break 10 Percent Mark In ‘08; Internet Ads To Rise 26.7 Percent: Zenith

ZenithOptimedia’s optimistic predictions for online are holding
steady. The Publicis Groupe media buyer’s latest forecast expects
global internet ad spend to grow 26.7 percent and break through the
10 percent share barrier this year, a year earlier than Zenith predicted
just three months ago. By 2010 Zenith predicts online will attract
13.6 percent of all advertising, well ahead of the company’s previous
prediction of 12.3 percent. As for actual dollar amounts, Zenith sees worldwide online ad spend this year of $52.2 billion, $64 billion in 2009 and $78.1 billion in 2010.
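
As a rough sanity check on those dollar amounts, the sketch below works out the year-over-year growth they imply. The spend figures and the 26.7 percent growth rate are Zenith's as quoted above; the variable names are ours, and the 2007 base is our own back-calculation, not a number from the report.

```python
# Worldwide online ad spend in billions of dollars, as quoted above.
spend = {2008: 52.2, 2009: 64.0, 2010: 78.1}
growth_2008 = 0.267  # Zenith's forecast growth for this year

# Year-over-year growth implied by the dollar forecasts.
growth = {year: spend[year] / spend[year - 1] - 1 for year in (2009, 2010)}

# Assumption: back out a 2007 base from the 2008 figure and growth rate.
implied_2007 = spend[2008] / (1 + growth_2008)

for year, rate in growth.items():
    print(f"{year}: {rate:.1%} growth")
print(f"Implied 2007 spend: ${implied_2007:.1f} billion")
```

Read this way, the forecast has growth cooling from 26.7 percent this year to roughly 22 to 23 percent in each of the following two years.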

The sunny views for online advertising are included in a report that
expects darkening clouds for growth in North America and Europe: Zenith
has downgraded its forecast for the former to 3.5 from 3.7 percent,
while growth for the latter is expected to end up 3.7 percent higher
instead of the earlier prediction of 3.9 percent. But thanks to the developing
world’s growing ad spend, Zenith has nevertheless upgraded its overall
spending forecast slightly, expecting a 6.6 percent boost in 2008, up slightly
from the 6.5 percent growth predicted in its March forecast.

The increasing economic uncertainty that Zenith notes is swirling in
Europe and North America will accelerate the shift to online,
Zenith said. Aside from that, the report cited improved online video
and targeting abilities as further reasons for marketers to shift more
of their budgets to the web. Still, Zenith’s positive predictions for
online ad spend come after several other, less sanguine industry
reports. Just over a month ago, Lehman Brothers analyst Doug Anmuth said
online ad spending in the U.S. will be up 23 percent, down from his
previous call for 24 percent growth. Meanwhile, display growth has been
trending lower, according to TNS’ look at Q1, when display gained only 8.5 percent compared to the previous year’s double-digit quarterly growth rates.

Article Link

Friday, June 27, 2008

Why Web 2.0 Is No Bubble: Corporations Are Willing to Pay for It

seems to want an answer to the question "When will Web 2.0 startups
start making money?" The implication is that unless we can answer the
question, the "bubble" of Web 2.0 will burst and all of us who believe
in this stuff will be revealed as fantasists.

The fact is, it's incredibly hard to make money as a Web 2.0 startup aimed at consumers.

There are hundreds of these companies, and they all clamor to brief
us at Forrester. Each has its own twist on blogs, social networks,
ratings, user generated video, or whatever. It's hard to get people to
pay attention to a new tool, and the value of the tool depends on lots
of participation -- the classic chicken-and-egg problem. Your
competitor is always one twist ahead of you. Some of these startups
will succeed but the odds are one in a thousand -- you need just the
right idea, at the right time, with the right push or set of potential
customers, and you need to take off with such velocity that you leave
the competition in the dust.

Once a startup like this does take off, there's that other
pesky little problem -- monetizing the success. Google transformed the
online world by first generating huge traffic, then finding a business
model. But Google's success was based on a fantastically clever
advertising mechanism that was automated, attracted new advertisers,
and served searchers nearly as well as it served advertisers. Facebook
hasn't yet unlocked that advertising gold mine, and flubbed up its most
prominent try with Beacon. Twitter has no business model yet. Ning has
hundreds of thousands of visitors, but still runs Google AdSense ads. And these are the successes. No wonder people are skeptical.

A few of these companies may (and likely will) unlock that genie as
Google did and take off. But for any given startup, the odds are

The amazing thing is that there is a class of startup companies
making good money right now from Web 2.0. They're not flashy and they
don't grow like mushrooms. But they've got all the business they can
handle and they are growing. I am talking about companies
that serve corporate social application needs. This isn't the typical
Web 2.0 business paradigm, since serving corporate customers means lots
of client service, which is people-intensive -- it doesn't lift off
miraculously like a pure technology startup. In fact, in many of these
companies, the technology itself is positively mundane. But the
startups grow because they deliver value for which they can charge a
premium and get customer loyalty. The customers of these companies
don't defect when something shiny and new comes along, because they
like the service they're getting.

Here are some examples, listed by the objectives they help companies
accomplish (for more on these objectives see Chapters 4 through 9 of Groundswell).

Listening. Communispace
now has hundreds of private communities that its client companies are
using to learn about their customers. It succeeds because it's unlocked
the key to running and moderating these communities effectively, and
grows despite charging $150K or more per year per community. The other
class of listening companies is the brand monitoring companies, and
the track record here is great. Research giant Nielsen bought BuzzMetrics. Another research giant, TNS, bought Cymfony. J.D. Power & Associates bought Umbria. MotiveQuest, which is still independent, has typical clients happily paying $30K and up to work with it.

Talking. Talking with the Groundswell is tricky,
but there are plenty of agencies ready to help you with it. After
building dozens of campaigns and sites, Blast Radius was bought by mega-agency Wunderman. Brains on Fire ignited the spectacular success of Fiskateers. The digital divisions of companies like Edelman also compete in this space, as do the big Web service companies like Avenue A/Razorfish (now part of Microsoft).

Energizing. Ratings and reviews are the easiest way
to energize customers to sell others, and the companies that provide
them are taking off. Bazaarvoice's clients have generated over 10 billion customer reviews. PowerReviews works with over 200 retailers. And ExpoTV has built a business around consumers creating reviews on video.

Supporting. Support forums work -- they please customers and they reduce costs. Lithium
has an impressive client list including Dell, AT&T, Comcast, and
Sprint. The community space is crowded, but other companies with
growing client lists include Jive Software, Awareness, and Mzinga/Prospero.

Embracing. Startups that enable clients to source
ideas from their customers have a bright future, because
customer-generated innovation is hot right now. Salesforce.com bought
Crispy News and turned it into Salesforce Ideas, which powers idea sites for Dell and Starbucks. And Innocentive
is growing rapidly, with 50 companies including Procter & Gamble
offering prizes of $10,000 or more to innovators that can solve their

While many were distracted by sparkly consumer-facing startups,
these companies were building and growing solid businesses. Look how
many of them were acquired! This is no bubble, because companies that
deliver business value to clients have durable growth potential. Could
this be the Web 2.0 business model everyone is looking for?

Article Link

Microsoft Acquiring Semantic Searcher Powerset For $100 Million: Report

Well if it can’t get Yahoo’s (NSDQ: YHOO) search business… Microsoft (NSDQ: MSFT) will acquire semantic search engine Powerset for more than $100 million, according to Matt Marshall at VentureBeat.
His exact language is that Microsoft “has agreed to acquire” the
company and that it will be announced next month. SF-based Powerset has
been something of a media darling, despite the fact that it hasn’t
taken off yet. In 2006 it raised a $12.5 million first round from
Foundation Capital and The Founders Fund, as well as various angels,
including Esther Dyson and PayPal founder Luke Nosek. Despite years of
interest in “semantic” or “natural language” search, this area is a
long way from proving that it works much better than current search
technology. VentureBeat also reports that its first round valued the
company at $42.5 million, so this wouldn’t be a huge win for the
investors. But given the uncertainty of this area, and the cash
requirements of an independent search engine, this might’ve looked like
a pretty attractive outcome.

For Microsoft, this deal would be a drop in the bucket—a tuck-in,
really. And by buying what’s basically a technology company, not one
with much market traction, it’s a sign that in the absence of Yahoo, it
still wants to compete with Google (NSDQ: GOOG) by out-engineering it.

Article Link

Microsoft To Buy Powerset? Not Just Yet.

VentureBeat is reporting that Microsoft has agreed to buy semantic search engine Powerset for somewhere around $100 million, which is the price we previously reported was being offered to the company.

Our sources have been saying since May that this deal is highly likely,
but it hasn’t actually been signed yet and could still be disrupted by the
ongoing Microsoft-Yahoo negotiations. Dave Wehner, a Managing Director at investment bank Allen & Co. (he’s the guy who sold Bebo for $850 million to AOL), is representing Powerset in the deal.

Powerset debuted at TechCrunch40 last fall and opened a showcase of its technology to the public just last month.

Powerset has raised around $12.5 million in venture capital, and is rumored to have taken another $8 million or so in convertible debt as bridge financing.

Article Link

Thursday, June 19, 2008

Learning from Flickr's Co-founders on Their Way Out of Yahoo

In June 2005 Yahoo! acquired upstart Canadian photosharing web site Flickr
and the web hasn't been the same since. Yahoo, on the other hand,
didn't change nearly as much as everyone expected it to. Pre-CEO Jerry
Yang told then-Business 2.0 writer Erick Schonfeld six months after the deal, "I
look at Flickr with envy, it feels like where the Web is going."

Flickr co-founders Caterina Fake and Stewart Butterfield have now cashed out and officially left the company.
Though Yahoo! doesn't appear to have internalized many of the lessons
of Flickr, it's not too late for the rest of us to look at those same
key lessons for inspiration in our work on the web.

Industry Context

There are a lot of photo sharing services on the web, but here's where
Flickr stood. Flickr was the trailblazer, the high-profile media
darling and one of the first major Web 2.0 acquisitions. Webshots
was much older, had been bought and sold for twice as much money but
never embodied the social media ethos the way Flickr did. PhotoBucket
is a year older than Flickr, has always been much larger and was
acquired by Fox for almost 10X Flickr's pricetag in the same week that
Flickr was pegged to replace the entire Yahoo! Photos property.

We've been critical of some of Flickr's strategies around everything
from censorship to data portability, but the big picture is that the
service is fantastic. Even though it wasn't the first and it wasn't
purchased for a particularly large sum (est. $35m) Flickr is still the
beacon of innovation in this sector. Here's why.

Customer Service is The New Marketing

One of the most important elements of Flickr's early success was its
incredible engagement with its users. Flickr management spent what
might have seemed like a totally unreasonable amount of time welcoming
new users to the site, participating actively and promptly in forums
and highlighting the best photos uploaded.

That kind of engagement can turn passing early adopters into ongoing
community stakeholders and advocates. It's something that any startup
could benefit from emulating and a role we're seeing formalized in an
increasing number of companies hiring community liaisons.

The Bleeding Edge Can Go Mainstream

Flickr proved that experimental, bleeding edge web 2.0 features
didn't have to be limited to early adopters. When Flickr brought
geo-tagging, the addition of location data to photo metadata, onto the
site, more than 1 million photos were geotagged in the first 24 hours.
Now that location aware services are heating up, who's in one of the
best positions to serve media up in that environment? Flickr is.

Flickr's APIs have been wildly successful. Mashup and API directory site ProgrammableWeb
lists more mashups using Flickr APIs than any other API on the web,
short of Google Maps. More than Amazon, more than eBay, more than

Flickr's FlickrAuth user authentication API was a key model for the standards-based OAuth protocol, now employed by Google's OpenSocial and hopefully soon by countless other applications.

Flickr broke new ground in numerous ways and proved that technical
experimentation didn't have to remain in the early adopter niche.

Being a Freak Will Not Kill Your Business

Butterfield wrote a great letter of resignation,
which was leaked to the bottom feeders at Valleywag but is a great
little read nonetheless. All parties say it's hardly out of character
and indeed, in my own passing interactions with the man, he was never a
fakely-nice typical business type worried about what might come around
someday from being nasty to any little blogging piss-ant that got in
his way.

Flickr came from Vancouver, British Columbia - in Canada. They must
be the national web 2.0 pride and joy of that freakishly wonderful

The next time someone gives you a hard time for being a freak at work, just cluck at them knowingly and think about Flickr.

Other Lessons

Other people have raised other issues that they think are key to
learn from the situation as well. Flickr power user and exec at rival
startup Zooomr Thomas Hawk offered some obviously heart-felt feelings about what the Flickr story said about acquisition and innovation.

"[They] developed an amazing product. Cashed out (smart).
[They] could have had incredible impact on the future of social search
and innovation at Yahoo but were thwarted by a band of disorganized
bumbling executive idiots who wouldn't recognize talent if it hit them
in the face. Most important opportunities to innovate came under Terry
Semel's watch who was more concerned with being the highest paid CEO in
America than either innovation or shareholder value."

(In response to Hawk's comment, Robert Scoble humorously replied
that Yahoo! "reminds me of Podtech. Had lots of superstars under their
roof and then couldn't listen to them to make things happen.")

Dave Winer told us that the move
makes him concerned about all the data that users have entrusted to
Yahoo! "Whatever emerges from this, the new company should immediately
embark on a program to make users' data portable," Winer said. "Users
have been an abstract thing to Silicon Valley, it would be great if now
that the superstars are leaving Yahoo, the industry could turn to the
users for inspiration, and start to trust them with their own work."

Flickr's handling of user data was generally accepted as a fairly
good work in progress. Now that the original minds behind the company
have left the building, it would be great for the new leaders there to
cement user trust in regards to their data by instituting some formal,
easy-to-use measures for users to make sure their photos are safe and


It would be fantastic to see Fake and Butterfield start something
new but they're certainly due all the relaxation time they want, too.
Once you've got a few million dollars in the bank, though, starting
more internet businesses may be a sign of limited imagination more than
anything else. For the rest of us still plugging away, Flickr offers
some great inspiration.

We're sure there are readers here who have been much more engaged in
the Flickr community than we have. What kinds of business lessons have
you learned from the company?

Wednesday, June 11, 2008

Display Ads Grew 8.5 Percent In Q1, Down From Last Year’s 16.7 Percent Gain: TNS

Display ad dollars in Q1 were way down from last year’s double-digit
growth rates, but the segment still managed to reach a healthy 8.5
percent gain, TNS Media Intelligence reported. In Q107, TNS said that display ads grew 16.7 percent, coming in at $2.7 billion. TNS, which does not look at search ad spending, did not release dollar figures this time out.

Still, considering that ad spend overall was essentially flat at 0.6
percent and the tepid growth of segments like cable TV (+4.1 percent)
and outdoor (+2.5 percent) compared to their stellar periods last year,
display isn’t doing too badly. To put display’s Q1 in further context,
consider that network TV expenditures were up 0.8 percent—its best quarterly performance in two full years.
Looking at the year ahead, there was no update one way or the other on
TNS’ January forecast, which predicted display ad revenue growth of
14.4 percent, down slightly from its 2007 tally of 15.9 percent gains.

-- Online reach grows 66.6 percent: Since April 2007, the
internet’s reach has grown 66.6 percent, according to a report by
Publicis’ media agency ZenithOptimedia. While most people use the
internet at home, during business hours, Zenith found that over 201
billion pages were viewed in April 2008. Other findings from
Zenith's web analytics after the jump:

-- Portals not dead yet: Yahoo (NSDQ: YHOO), Google (NSDQ: GOOG),
and MSN remain the most visited web destinations. YouTube’s reach has
gained 76.9 percent since April 2007 - impressive numbers, since the
top ten sites saw reach grow by just 6.8 percent; time spent on the top
10 was up 31.3 percent during the same period.

-- Uniques: In terms of unique visitors, online questions
and answers site Wikianswers was the top gainer (520 percent) and jobs
site Monster saw the largest decline (-26 percent) among the top 100
visited websites.

-- Reach by category: TV sites’ reach was up 8 percent,
“multimedia sites” grew 15.5 percent, time spent 14.6 percent; blogs
increased 15.6 percent, and online gaming sites climbed 12.3 percent.
And, no shocker to anyone, Google remains the online search leader as
of April 2008 with a 56.5 percent total share of searches.

Article Link

Internet Display Advertising Slowed In First Quarter

In the first quarter of 2008, the growth in spending on Internet
display advertising slowed to 8.5 percent from 16.7 percent growth last
year, according to estimates put out today
by TNS Media Intelligence. Even with the slowdown Internet ad spending
still grew faster than that for TV (1.7 percent), magazines (0.8
percent), newspapers (-5.2 percent), radio (-4.5 percent), and outdoor
(2.5 percent). The overall growth of all advertising spending that TNS
measures was flat at 0.6 percent growth over the first quarter of 2007.

TNS’s Internet numbers do not include search advertising, only
display ads. The quarterly total for all Internet advertising is closer to $6 billion.
But this data point is evidence that the Web may not be immune to
weakness in advertising spending overall. If the industry dives into a
full-blown advertising recession, many Web companies could feel the

This year, TNS only provided the percentage changes. Since it provided absolute dollar values last year,
I did my own math and put together the table below. In the first
quarter of 2008, $2.9 billion was spent on Internet display ads in the
U.S., representing an 8.3 percent share of the $35.1 billion total.
That puts Internet display advertising ahead of radio ($2.2 billion),
but behind newspapers ($6.0 billion), magazines ($6.8 billion), and TV
($15.9 billion). My figures are rounded, and the percent changes are
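
The back-of-the-envelope math behind those figures can be reproduced in a few lines. This is only a sketch of the arithmetic: the $2.7 billion base is the TNS figure for the first quarter of 2007 quoted in the previous item, and rounding to one decimal place is an assumption about how the totals were presented.

```python
# Figures quoted in the articles above, in billions of US dollars.
display_q1_2007 = 2.7   # TNS estimate for Q1 2007 display spending
display_growth = 0.085  # 8.5 percent year-over-year growth
total_q1_2008 = 35.1    # total Q1 2008 ad spending across measured media

# Apply the growth rate to last year's dollar figure.
display_q1_2008 = display_q1_2007 * (1 + display_growth)

# Share of the total, using the rounded display figure.
display_share = round(display_q1_2008, 1) / total_q1_2008

print(f"Q1 2008 display spending: ${display_q1_2008:.1f} billion")
print(f"Share of total ad spending: {display_share:.1%}")
```

Run as written, this gives roughly $2.9 billion and an 8.3 percent share, matching the numbers in the text.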

Article Link

Monday, June 9, 2008

Powerset vs. Cognition: A Semantic Search Shoot-out

Nitin Karandikar,
Saturday, June 7, 2008 at 9:00 AM PT

Powerset, which implements
semantic search, recently released a public beta based on the limited
data set of Wikipedia. But while there is no question that Powerset has
some interesting and valuable semantic search technology — many of
their demo queries produce meaningful summary pages and reference pages
with information extracted from Wikipedia content — there are other
semantic search engines that produce equally meaningful and relevant

In this post, we compare Powerset results with those of a demo implementation from one such search engine, Cognition Technologies. And we compare them both with the current gold standard in web search, Google (again, limited to the Wikipedia data set).

Example 1: Powerset

There are some classes of queries in which Powerset shines, such as
whenever the query involves extracting concepts or aggregation of data
from a given data set.

For example, check out the beautifully presented results for the
following queries that extract key information the user is looking for
and provide it in summary format:

“military intelligence”

“teams in the NFL”

Example 2: Cognition Technologies

On the other hand, there are other types of queries — especially
where hardcore semantic parsing is involved — where the Powerset
algorithms get confused, and Cognition gives better results:

“rare wildlife of the Amazon”

“football players who went to jail”

Example 3: Google

There are still queries (especially when semantic parsing is not
involved) in which Google results are much better than either Powerset
or Cognition:

“helicopter carrier Iwo Jima class”

Here, surprisingly, Google has the best results. Powerset has
related results, Cognition gets totally confused, but Google nails it!


One area where both Powerset and Cognition improve on Google is the
disambiguation of query terms. This is always a significant issue for
search engines; for example, when a user types in the keyword Java,
does she mean the island, the programming language, or the coffee?

Google has recently tried some experiments in this area, but these new search engines go one better.

When Powerset sees an ambiguous topic, it uses tabs to provide both sets of results:

Cognition handles it in a different way, by letting the user select from among different semantic meanings for each term:

User Impact

For most common searches, Google search works just fine. We’ve all
gotten used to the ubiquitous “keyword-ese,” currently the universal
language of web search. With Google’s unlimited resources,
comprehensive index and formidable prowess in finding relevant results
using the PageRank algorithm, it’s going to be difficult for any other
search engine to match those results. Users may have to work just a
little bit harder for unusual queries or specialized searches, but most
users will accept that trade-off in return for using their familiar and
beloved search engine. Indeed, the word Google has come to represent
web search in the same way that the word Xerox had once come to
symbolize the process of photocopying.

Future Competition

So what can Powerset (and Cognition) do to gain traction and capture users?

In their recent book, “The Innovator’s Solution,”
Clayton Christensen and Michael Raynor discuss how upstart companies
challenging market leaders and entrenched incumbents can position new
technologies for a reasonable chance of success. One approach that they
believe is guaranteed to fail is when these smaller upstarts try to
make evolutionary improvements to get and stay ahead of the major

Instead, they suggest shaping the new technology into a disruptive innovation, along either of the following two major axes:

1. New-market strategy: Leveraging the innovation to attract users
who do not typically participate in using the product or service, and
thus growing the market as a whole.

2. Low-end strategy: If there are price-sensitive, over-served
users who would be willing to trade some of the advanced functionality
in return for a lower price point, then the smaller players have an
opportunity to enter the market — that is, if they can figure out a way
to make a profit.

In other words, the new players entering the market have to find
profitable business opportunities in segments of the market that are
not attractive to market leaders.

Using this model, it is apparent that a strategy of challenging
Google head-on for control of the mainstream web search market has
little hope of success, regardless of the new technologies or search
innovations that are applied. Google would have no choice but to fight
back with everything it’s got to catch up to or leapfrog this “better
search” alternative.

Similarly, since Google search is free for users, there is really no
viable low-end strategy, no way to outdo the existing search leader by
offering a lower price point.

What about non-participant users? Practically everyone online
already uses a web search engine (with Google being the overwhelming
favorite). However, Google search follows a specific, consistent set of
guidelines: simplicity of UI, speed of response, and relevance based on
incoming links. These design parameters take top priority over all
other considerations.

By challenging these assumptions, we can discover new use cases in
search that are underserved (or not served at all) by Google. Some
examples include:

1. UI Simplicity: Google’s minimal UI is trivially simple to use
and ideal for a one-size-fits-all model, but it may be less than
optimal for complex semantic searches. As Alex Iskold points out in his
recent article on the myth and reality of semantic search,
a richer user interface would allow power users to express
semantically-rich search queries and get back better results. Notably,
Powerset and Cognition excel at these types of queries.

2. Speed: For some types of advanced searches, users might be
willing to wait, perhaps even as long as a day, in order to get back
semantically complex results. Imagine a software agent that acts as a
virtual search assistant - once the user specifies a query with
multiple levels of complexity and dependency, the agent goes off and
returns the next day with a list of possible results/options. Queries
that require the coordination of complex tasks fall into this category,
such as planning a trip that requires coordinating air travel, hotel
and car, and minimizing the cost of the whole trip while taking some
additional factors into consideration.

3. Relevance: Although all the mainstream search engines use similar
criteria to evaluate relevance (mainly, the evidence of incoming
links), other relevance algorithms are certainly feasible and may work
better for certain classes of queries. Social relevance is an obvious
example; reputable premium content is another.

This post is in no way meant to discredit Powerset — they’re in
early beta and are doing a fine job of building semantic search.
Instead, the examples above clearly demonstrate that the jury is still
out on semantic search; other search engines are also contenders in
this space, and the race is far from won.

Article Link

Thursday, June 5, 2008

Thinkbase: Mapping the World's Brain

If Freebase is an "open shared database of the world's knowledge," then Thinkbase (found via information aesthetics)
is a mind map of the world's knowledge. The interesting and incredibly
addictive Freebase visualization and search tool is the brainchild of
master's degree student Christian Hirsch at the University of Auckland.
Thinkbase is one of the cool proof of concept applications built on top
of Freebase that we mentioned last week.

As we've mentioned here on RWW, Freebase is best suited for complex
inferencing queries -- the type that expose relationships between
various entities to figure out an answer. Things like, "What's the name
of the actor who was in both 'The Lord of the Rings' and 'From Hell'?"
(Answer: Ian Holm)

Thinkbase doesn't necessarily answer those questions -- at least not
directly, but it does allow people to visually explore the
relationships that Freebase can expose. Thinkbase employs the Thinkmap
visualization software to visually represent the semantic relationships
between objects on Freebase as an interactive mind map. Each object on
the map is represented by an icon that corresponds to the type of
object it is. For example, person, place, movie, song, or artwork.

The site uses a two-pane display, putting the relationship map in
the left pane, and the Freebase entry for the active node in the right
pane. Every node on a Thinkbase map can be expanded to see concepts
related to that object, or collapsed to clean the graph of
relationships you're unconcerned with. Every map you create can also be
linked to via a dynamic share URL.

Thinkbase is a really fun visual front end to the Freebase database
that exposes the semantic relationships that such a database can reveal
in a compelling way. Alex Iskold wrote last week
that the problem with semantic search is that we're asking the wrong
questions. Tools like Thinkbase can help us start to think about what
type of questions we should be asking by clearly showing the type of semantic relationships that databases like Freebase excel at finding.

Article Link

Tuesday, June 3, 2008

Google Search Ads Rile Its Big Customers

As Google Inc. pushes to sell ads crucial to its
revenue growth, some of its largest advertisers are growing angry with
the way the company oversees its sponsored searches.

The problem is a tactic known as "piggybacking," in
which smaller advertisers use major players' brand names, slogans or
other trademarked words in the text of search ads to lure Web surfers
to their own sites.

While Google
and other search engines have policies against this maneuver, some
marketers say the practice often goes unchecked. The brick-and-mortar
world has long-established laws in this area, but the legal situation
is less clear for the Internet and has only recently started to be
tested in the courts.

Tensions over piggybacking have been simmering for a couple of years. Companies such as Marriott International Inc., InterContinental Hotels Group PLC, AMR Corp.'s American Airlines and Northwest Airlines
Corp. say the use of their names and slogans in the text of other
companies' search ads confuses potential customers and increases their
cost of doing business. They are particularly upset with Google, which
is the dominant player in the search business. It controlled 71.2% of
the search market last year, according to research firm eMarketer Inc.

As a result, Google could face a backlash as it
attempts to grab a bigger share of other advertising niches, including
display advertising and video ads. Big advertisers say they may punish
Google if they aren't satisfied with the way the piggybacking dispute
is dealt with. "This does play into our decision of overall spending --
it has to," says Michael Menis, vice president of global marketing
services at InterContinental.

[google ad]
Some Google advertisers are upset their names are in ads for other sites.

Adds John Gustafson, director of distribution and
Internet strategy at Northwest Airlines: "If Google has an inability to
help us resolve issues about abuses of our brand, that would impact our
decision to participate in future forms of advertising."

Last August, American Airlines filed a suit against
Google in federal court in Fort Worth, Texas, seeking restitution for
damages caused by trademark infringement on the search engine. The
airline is asking Google to stop selling its trademarked terms to other
advertisers. This practice is "utilizing our brand that we've built for
more than 80 years for the benefit of someone else," says American
Airlines spokesman Billy Sanez.

Google says it is disappointed that the court denied
its motion to dismiss the lawsuit. It believes the suit lacks merit.
"Google's trademark policy strikes a proper balance between trademark
owners' interests and consumer choice and has been validated by prior
court decisions," a Google spokeswoman says.

Google acknowledges that piggybacking occurs and says
that when it gets complaints, it investigates the claims and tries to
stop the practice. "We have a long-running policy where we don't allow
advertisers to use trademarked terms in ad text to avoid creating any
user confusion," says Richard Holden, a product-management director at

The other main players in the search-advertising market are Yahoo Inc. and Microsoft Corp. Both say they have policies similar to Google's.

The way search-engine advertising works, marketers bid
on key words in a continuous auction. InterContinental, for example,
bids on millions of key words a day from Google in 11 different
languages. Among them are its own brand names, such as "Holiday Inn
Express" and "Crowne Plaza Los Angeles." When a consumer searches for
any of the words, the company's ad appears above or next to the
results, depending on the amount the company bids and an algorithm
Google uses to determine an ad's relevance to a search.

Companies only pay Google for the key words if someone clicks on their search ad.
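
The mechanics described above, where placement depends on both the bid and a relevance judgment and payment is due only on a click, can be sketched roughly as follows. The article does not say how Google combines the two, so the scoring rule here (bid multiplied by a relevance score) is purely an assumption, as are the advertiser names and numbers. Charging the full bid on a click is also a simplification.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float        # maximum price per click, in dollars
    relevance: float  # hypothetical relevance score between 0 and 1


def rank_ads(ads):
    """Order ads by bid weighted by relevance (an assumed scoring rule)."""
    return sorted(ads, key=lambda ad: ad.bid * ad.relevance, reverse=True)


def charge(ad, clicked):
    """Pay-per-click: the advertiser owes nothing unless the ad is clicked."""
    return ad.bid if clicked else 0.0


# Made-up advertisers: a higher bid does not guarantee the top slot.
ads = [
    Ad("hotel-brand", bid=1.50, relevance=0.9),
    Ad("discount-site", bid=2.00, relevance=0.4),
]
ranked = rank_ads(ads)
print([ad.advertiser for ad in ranked])
print(charge(ranked[0], clicked=False), charge(ranked[0], clicked=True))
```

In this toy case the brand owner outranks the higher bidder because its ad is judged more relevant, which is the trade-off the article alludes to.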

For large companies, the frustration comes when their
names and other well-known phrases are used in the text of a search ad
leading to an unrelated site. A recent Google search using the words
"Marriott Atlanta," for instance, brought up an advertiser-paid link
labeled "Marriott Atlanta." That led to, a discount
hotel-reservations site. But a link on the site for a Marriott hotel
room in Atlanta ultimately led to an error page. Marriott says the site
isn't authorized to use the Marriott name in its online text. didn't respond to requests for a comment. The link on Google has since disappeared.

The piggybacking that Marriott, American and others
are complaining about is not to be confused with another practice known
as "conquest buys," in which marketers buy a competitor's term so that
an ad for their own product appears when a consumer searches for the
other brand. The difference is, the text of the ad doesn't contain the
competitors' name or slogan. While companies have also protested this
practice, Google's policies allow it, unlike piggybacking.

Piggybacking is a big problem for marketers that do a
significant amount of business online, experts say. If it is allowed to
continue, companies seeking online visitors will be forced to pay more
to advertise in search engines because rising demand will force up the
cost of key words, says Eric Clemons, a professor at the University of
Pennsylvania's Wharton School who follows the search-ad business.

The companies interviewed for this article say they
aren't able to put a dollar amount on their claims of lost business as
a result of the piggybacking. But concerns like InterContinental, which
spends more than half of its online marketing budget on search ads, say
they depend on these ads to generate sales. "Any research will tell you
search is the place where people research travel," Mr. Menis says.

A recent Google search with the words "Holiday Inn
Orlando" brought up a sponsored link labeled "Holiday Inn Orlando." It
led to, an online travel comparison-shopping site.
InterContinental Hotels, which owns Holiday Inn, says is
not authorized to advertise using the Holiday Inn name. says it bids on millions of search terms
at any given time and often uses Google's automatic system to generate
its advertising copy. "What we rely on Google to do is to essentially
stay within its own policies so that if a given key word or a search
term that we are bidding on should not show up in the search ad, it
doesn't," says Steve Yi, senior vice president of Oversee Marketing
Services, which owns says if it is notified
of a violation, it immediately takes down the ad.

Some advertisers are demanding that Google and other
search engines create an automatic system that will only allow
advertisers to use other companies' names and slogans in the text of
search ads if they have permission.

But Google says its system works. "We are trying to
balance advertisers and trademark owners and user interests," Mr.
Holden says.

Article Link

Ad Network Collective Media Acquires Audience Targeter Personifi

Online ad network Collective Media has bought ad targeter Personifi,
the company told paidContent. Specific terms weren’t disclosed,
although a Collective Media rep said it was “an eight-figure deal” in
cash and stock. Last October, New York-based Collective Media raised an
undisclosed first round from Greycroft Partners and iNovia Capital. At
the same time, Collective began collaborating with Fort Worth,
Tx.-based Personifi on audience targeting and “content classification.”
It currently uses Personifi on 50 percent of its campaigns. Collective
scored its most notable assignment in February, when it was chosen to power QuadrantOne, the online newspaper advertising alliance backed by The Tribune Company, Gannett (NYSE: GCI), Hearst and the New York Times (NYSE: NYT) Company.
Article Link

Wednesday, May 28, 2008

Freebase: Dispelling The Skepticism

Freebase, the first product of semantic web company Metaweb, is an open, semantically marked up database of information that we called one of the "10 semantic apps to watch" last year. With $57.4 million in funding, a smart team, and a tech legend in Danny Hillis at the helm, Metaweb is considered to be one of the most serious players in the Semantic Web space. Yet the company's
efforts to date have been met with skepticism. In particular, people have asked how Freebase is different from Wikipedia. Jamie Taylor, the Minister of Information at Metaweb, spoke at the SemTech 2008 Conference that took place in San Jose last week in an effort to dispel some of that skepticism.

What is Freebase?

Jamie has an interesting title: Minister of Information, and his
primary responsibility is to seed Freebase with information and ensure
the quality of the data. According to Jamie, Freebase is an "open shared
database of the world's knowledge."
This sounds the same as Wikipedia, but it is really quite different,
because at the heart of Freebase are the ideas of semantics and
openness via API.

Unlike Wikipedia, which is a free form database, Freebase is structured, where concepts and relationships
are interlinked into a gigantic network or graph. Another important difference is that Freebase is all about its API.
Any information contained inside the database is accessible and can be retrieved via queries. In addition, the data
in Freebase is under a Creative Commons license - meaning that it is readily exportable and usable by others.

When it comes to defining the meanings of things, Freebase is focused on community, with collective editing, attribution,
and collaboratively built semantics. This last point is quite crucial - the founders of Freebase believe that meaning
has to emerge from the collaboration between users. As such, Freebase is one of the first experiments of web-scale
social contracts. The site is really focused on the notion that information is not encumbered by licenses and is free to use.

What is in Freebase Today?

Data comes into Freebase from many sources: Wikipedia, Flickr, the
US Department of Commerce, MusicBrainz, the USGS,
SFMOMA, the US Securities and Exchange Commission, ChefMoz, and many other places.
Right now the information is mostly about people and places, but the
system is engineered to have a wide range of data types. As an example of
"People" information, there
is a lot of information in Freebase about artists along with their
artwork and place in history.
More esoteric types of information you might find in the database
include airplanes, French cheese, tropical storms in the 90s,
oil companies, and candies.

Freebase also contains lots of other kinds of data and has:

  • 3.4 Million Subjects
  • 750K People
  • 450K Locations
  • 50K Companies
  • 40K Movies
  • ... Over 1K Data Types with over 3K Properties

Data Representation in Freebase

While Freebase certainly has a long way to go before it can claim
completeness of information,
its core idea of object representation and linking seems very solid.
Each object in Freebase is unique.
As more information comes into the system about an object, more links
are created about it in the system.
It is particularly interesting how Freebase establishes object identity
and decides that two concepts (or subjects) are the same.

The diagram above illustrates the idea. When a new source of information is added to Freebase, it is parsed into
entities and facts. The new information is then cleaned up and is merged with the existing system. But
the merge only occurs if the system determines that the two bits of information are really about the same subject (in
this case Leonardo Da Vinci). This is a powerful approach which allows Freebase to grow the knowledge around individual
subjects. What is also interesting is that Freebase allows human editing to reconcile situations when the system
is unable to automatically link the two concepts together.

Each permanent object in the system has a GUID - a unique
identifier, something like this: #9202a8c040000064.....
The identifier can be used to refer to the object via URL and via
queries. In addition to the GUID, there are other
ways to refer to the object, for example, Beyond that, there
are even other
aliases, for example, you can refer to a public company by its stock
ticker symbol. But regardless of the reference, the key point is that
you end up with the same, unique node in the system.
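
As a rough illustration of that last point, the toy store below resolves a GUID, a readable key, and a ticker symbol to one and the same node. It is a minimal sketch, not Freebase's implementation: the `Graph` class, the identifiers, and the alias formats are all made up for the example.

```python
class Graph:
    """Toy store in which every reference resolves to one node per subject."""

    def __init__(self):
        self.nodes = {}    # GUID -> node data
        self.aliases = {}  # alternative key -> GUID

    def add_node(self, guid, name):
        self.nodes[guid] = {"guid": guid, "name": name}

    def add_alias(self, key, guid):
        self.aliases[key] = guid

    def resolve(self, ref):
        """Accept either a GUID or any registered alias."""
        guid = ref if ref in self.nodes else self.aliases.get(ref)
        return self.nodes.get(guid)


graph = Graph()
graph.add_node("#guid-0001", "Example Corp")       # made-up GUID
graph.add_alias("key:example_corp", "#guid-0001")  # made-up readable key
graph.add_alias("ticker:EXMP", "#guid-0001")       # made-up ticker symbol

# Both lookups return the very same object, not two copies.
same = graph.resolve("#guid-0001") is graph.resolve("ticker:EXMP")
print(same)
```

Keeping aliases as pointers to a single GUID, rather than as separate records, is what guarantees that new facts attach to one node no matter how it was referenced.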

Freebase also has the ability to create new domains and types that describe new concepts, for example, science fiction movies.
There is a way to attach new data types to the existing domains, and then these types can be shared and used by other users.
The idea is that you can model things with the fine grained resolution that you need and then you can invite people
to help you refine and evolve your models. An example is the motorcycle community, which evolved out of an effort led by
one guy who was then joined by others, and has since been promoted to the top level. The community process
is about merging private types to build common models.

What Can You Do With Freebase?

Freebase is not a formal system, it is not a reasoning engine, it is just a knowledge repository, a database.
To query Freebase you use the Metaweb Query Language (MQL), which is based on JSON. The language is meant to be very simple
and it is actually very interesting as well. The idea is that you fill out a tree which represents a partial
graph with pieces that you know and then the system basically fills in all the slots that you left blank
and delivers back all possible subgraphs.

For example, say you are watching a movie and you can't tell what it is.
You know that the movie stars Patrick Swayze and an actress who was
also in "Tank Girl." So you create a movie query and express all these
facts, using JSON-style syntax.
And when you run the query you get back that the actress is Lori Petty
and the movie
is "Point Break" and you also get links to IMDB. So the query and the
results have the
same structure and to find matches you simply traverse the set of
results that is returned.
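
The fill-in-the-blanks idea can be sketched in a few lines of Python. This is not MQL and does not call the Freebase API: the `fill` helper, the field names, and the three hand-entered film records are assumptions made for illustration, and the single query from the example is split into two lookups to keep the code short.

```python
# Hand-entered sample data standing in for the database.
films = [
    {"name": "Point Break", "starring": ["Patrick Swayze", "Keanu Reeves", "Lori Petty"]},
    {"name": "Tank Girl", "starring": ["Lori Petty", "Naomi Watts"]},
    {"name": "Dirty Dancing", "starring": ["Patrick Swayze", "Jennifer Grey"]},
]


def fill(template, records):
    """Return a filled-in copy of the template for every matching record."""
    results = []
    for record in records:
        filled = {}
        for key, wanted in template.items():
            actual = record.get(key)
            if wanted is None:
                filled[key] = actual  # blank slot: copy the stored value
            elif isinstance(wanted, list):
                # Every listed item must appear in the stored list.
                if not isinstance(actual, list) or not all(w in actual for w in wanted):
                    break
                filled[key] = actual
            elif wanted == actual:
                filled[key] = actual  # known value matches
            else:
                break
        else:
            results.append(filled)  # no constraint failed
    return results


# Step 1: who was in "Tank Girl"? The cast is left blank.
tank_girl = fill({"name": "Tank Girl", "starring": None}, films)[0]

# Step 2: which film stars Patrick Swayze and one of those actors?
answers = [
    (match["name"], actor)
    for actor in tank_girl["starring"]
    for match in fill({"name": None, "starring": ["Patrick Swayze", actor]}, films)
]
print(answers)
```

Note that each result has the same shape as the template that produced it, which is the property the paragraph above describes.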

Building on this example, Freebase is really meant for complex inferencing queries, the sorts
of questions that Google has no way of answering using its statistical frequency algorithms.
For example, which US senators took money from a foreign entity? Turns out that both Barack
Obama and Hillary Clinton received donations from UBS AG, based in Switzerland. That is a complex
inferencing query that needs to be expressed in a query language before it can be answered
and so questions of this nature are outside of the reach of any search engine -- and Wikipedia too, for that matter.


There is quite a lot of activity going on around Freebase today.
Many enthusiasts are building small proof of concept applications
showcasing what can be done
in the future with this powerful database. You can stay on top of the
cutting edge stuff coming both from the Freebase team and community at: and

Article Link

Friday, May 23, 2008

You Play a Game, Computers Get Smarter, AI Starts to Work

Last week a new site called Gwap was launched by Carnegie Mellon's School of Computer Science.
The site offers an array of multi-player games that have a benefit
beyond just that of momentary distraction or amusement. These games are
helping improve image and audio searches, teaching computers to see,
and enhancing AI. However, all that won't matter to the players
because, as it turns out, these games are actually fun.

About Gwap

Nicholas Carr blogged about Gwap
a couple of days after its launch, noting that "one thing the Internet
enables, which wasn't possible before, at least not on anywhere near
the same scale, is the transfer of human intelligence into machine
intelligence." In Gwap, which stands
for "Games With a Purpose," that transfer of intelligence is done by
getting people to do the routine chores that computers don't know how
to do - chores like tagging photos, describing songs, and outlining
objects, as well as transferring a good bit of human common sense to
the machine. The trick to getting people to do these things is to make
the work fun. Hence the games.

The creator of these games is Luis von Ahn, winner of a 2006
MacArthur Foundation "genius grant" and a pioneer in the field of human
computation. Von Ahn is best known for helping to develop CAPTCHAs
(Completely Automated Public Turing Test to Tell Computers and Humans
Apart), those somewhat annoying but rather effective distorted letter
puzzles used millions of times each day. Last year, he also introduced
reCAPTCHA, in which the CAPTCHAs used to gain access to a web site also help digitize old books.


The Games

Gwap currently features five games, one of which is an old classic called the ESP Game.
In the ESP Game, two players view the same image and try to guess words
that the other player would use to describe it. Google licensed this
technology and launched Google Image Labeler to help improve the quality of its image search results.
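
The matching idea behind the ESP Game can be sketched in a few lines of Python. This is only an illustration of the description above, not Gwap's or Google's actual code: the function names `esp_round` and `collect_labels` are hypothetical, and the rule that the first word typed by both players becomes the image's label is an assumption made for the sketch.

```python
from collections import Counter


def esp_round(guesses_a, guesses_b):
    """Return the first word both players typed, or None if they never agree.

    Guesses are interleaved as if the two players typed in turn
    (an assumption of this sketch).
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(guesses_a), len(guesses_b))):
        if i < len(guesses_a):
            word = guesses_a[i].strip().lower()
            if word in seen_b:
                return word
            seen_a.add(word)
        if i < len(guesses_b):
            word = guesses_b[i].strip().lower()
            if word in seen_a:
                return word
            seen_b.add(word)
    return None


def collect_labels(rounds):
    """Count how often each agreed word was produced across many rounds."""
    labels = Counter()
    for guesses_a, guesses_b in rounds:
        match = esp_round(guesses_a, guesses_b)
        if match is not None:
            labels[match] += 1
    return labels
```

Because neither player can see what the other types, a word they both arrive at is likely to be a reasonable description of the image, which is what makes the agreed words useful as search labels.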

The four new games include:

Matchin, a game in which players judge which of two images is more appealing, is
designed to eventually enable image searches to rank images based on
which ones look the best.
Tag a Tune, in which players describe songs so that computers can search for music
other than by title - such as happy songs or love songs.
Verbosity, a test of common sense knowledge that will amass facts for use by artificial intelligence programs.
Squigl, a game in which players trace the outlines of objects in photographs to
help teach computers to more readily recognize objects.
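
As a rough illustration of how Matchin-style votes could feed an image ranking, here is a minimal Python sketch. The announcement does not say how the votes are actually aggregated; ranking by win rate, and the name `rank_by_wins`, are assumptions made purely for this example.

```python
from collections import defaultdict


def rank_by_wins(votes):
    """Rank images by the share of head-to-head comparisons they won.

    `votes` is a list of (winner, loser) pairs, one per comparison.
    Win rate is an assumed scoring rule, not Gwap's documented method.
    """
    wins = defaultdict(int)
    shown = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        shown[winner] += 1
        shown[loser] += 1
    # Highest win rate first; ties are broken alphabetically by image name.
    return sorted(shown, key=lambda img: (-wins[img] / shown[img], img))
```

For instance, `rank_by_wins([("a", "b"), ("a", "c"), ("b", "c")])` puts image "a" first, because it won both of its comparisons.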

According to the Carnegie Mellon announcement, von Ahn plans to add a lot of games to the site, saying "we have three more that we'll be launching in the coming months."
He hopes that by having all the games on the same site it will
encourage players to try several different ones. Players also have a
single sign-on and password, Top Player rankings, and online chats,
said von Ahn.

The Human Processor

In his whitepaper entitled "Invisible Computing," von Ahn compared game design to algorithm creation, saying:

"[The game] must be proven correct, its efficiency can be
analyzed, a more efficient version can supersede a less efficient one,
and so on. Instead of using a silicon processor, these "algorithms" run
on a processor consisting of ordinary humans interacting with computers
over the Internet."

In other words, we're the processor. The machine is us.

This concept isn't entirely new - Amazon's Mechanical Turk,
for example, pays people to contribute their time to work on small,
simple tasks called "Human Intelligence Tasks," or HITs. However,
unlike HITs, which can sometimes be boring or tedious, the games on
Gwap are actually fun - and they don't feel like work.

Some believe that human-powered processing is the next big wave for computing. You could argue that Mahalo, the human-powered search engine, is an example of this. (Though others call it a human-powered link farm.) Perhaps a better example is ChaCha,
the mobile Q&A service that uses human guides to respond to
questions called or texted in from your cell phone. We've also covered
other human-powered services on RWW in the past, like the Galaxy Zoo
and Stardust@Home projects, among others. Many of these efforts have tried to incorporate an element of "fun" into what is actually work.

Whether Gwap will actually gain
momentum and get a large number of people involved is yet to be seen,
but it definitely has the potential to help teach computers the things
they can't do for themselves... yet.

Article Link