Monday, November 26, 2007

The wireless epidemic

The wireless epidemic by Jon Kleinberg

At one end are network models that reflect strong spatial effects, with nodes at fixed positions in two dimensions, each connected to a small number of other nodes a short distance away [9]. At the other end are ‘scale-free’ networks, which are essentially unconstrained by physical proximity, and in which the number of contacts per node are widely spread [14]. Models based on human travel data occupy an intermediate position in this spectrum of spatial constraints. The different network structures lead in turn to qualitative differences in the way epidemics spread: whereas epidemics can persist at arbitrarily low levels of virulence in scale-free networks [14,15], epidemics in simple two-dimensional models need a minimum level of virulence to prevent them from dying out quickly [9].
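
The contrast between the two ends of the spectrum can be illustrated with a rough calculation. The sketch below assumes the standard heterogeneous mean-field approximation for an SIS epidemic, in which the threshold is the ratio of the mean degree to the mean squared degree; the excerpt does not spell this formula out, and the function names and the exponent 2.5 are illustrative choices only. Mean-field is crude for a lattice, so the lattice figure only shows that a fixed degree keeps the threshold away from zero, while a heavy-tailed degree distribution pushes it towards zero as larger hubs are allowed.

```python
def epidemic_threshold(degree_probs):
    """Mean-field SIS threshold <k> / <k^2> (an assumed approximation).

    degree_probs maps a degree k to its probability P(k).
    """
    mean_k = sum(k * p for k, p in degree_probs.items())
    mean_k2 = sum(k * k * p for k, p in degree_probs.items())
    return mean_k / mean_k2


def lattice_degrees(k=4):
    """Every node has the same small number of nearby contacts."""
    return {k: 1.0}


def power_law_degrees(gamma, k_min, k_max):
    """Truncated power law P(k) proportional to k^(-gamma)."""
    weights = {k: k ** -gamma for k in range(k_min, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


print("lattice:", epidemic_threshold(lattice_degrees(4)))
for k_max in (100, 1000, 10000):
    # Hypothetical exponent and minimum degree, chosen for illustration.
    degrees = power_law_degrees(gamma=2.5, k_min=2, k_max=k_max)
    print("scale-free, k_max =", k_max, ":", epidemic_threshold(degrees))
```

The threshold for the heavy-tailed case keeps shrinking as the largest allowed degree grows, which is the sense in which such epidemics can persist at arbitrarily low virulence.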

Bluetooth ...is disrupting this dichotomy by making possible computer-virus outbreaks whose progress closely tracks human mobility patterns. These types of wireless worm are designed to infect mobile devices such as cell phones, and then to continuously scan for other devices within a few tens of metres or less, looking for new targets. A computer virus thus becomes something you catch not necessarily from a compromised computer halfway around the world, but possibly from the person sitting next to you on a bus, or at a nearby table in a restaurant.

9. Durrett, R. SIAM Rev. 41, 677–718 (1999).
14. Pastor-Satorras, R. & Vespignani, A. Phys. Rev. Lett. 86, 3200–3203 (2001).
15. Berger, N., Borgs, C., Chayes, J. T. & Saberi, A. in Proc. 16th ACM Symp. Discr. Algor. 301–310 (ACM, New York, 2005).

The impact of social structure on economic outcomes

Some extracts from The impact of social structure on economic outcomes.

4 core principles:

1) Norms and Network Density. ... the denser a network, the more unique paths along which information, ideas and influence can travel between any two nodes. Thus, greater density makes ideas about proper behavior more likely to be encountered repeatedly, discussed and fixed; it also renders deviance from resulting norms harder to hide and, thus, more likely to be punished. ... larger groups will have lower network density because people have cognitive, emotional, spatial and temporal limits on how many social ties they can sustain.
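
The claim that larger groups have lower density follows from simple arithmetic, sketched below. Density is taken here as the number of ties divided by the number of possible pairs, which is the usual definition for an undirected network; the budget of 10 ties per person is a hypothetical figure, not one from the paper.

```python
def network_density(num_nodes, num_ties):
    """Fraction of all possible undirected pairs that are actually tied."""
    possible_pairs = num_nodes * (num_nodes - 1) / 2
    return num_ties / possible_pairs


def density_with_tie_budget(num_nodes, ties_per_person):
    """Density when every person sustains the same fixed number of ties."""
    # Each tie is shared by two people, hence the division by two.
    num_ties = num_nodes * ties_per_person / 2
    return network_density(num_nodes, num_ties)


for group_size in (11, 101, 1001):
    # Hypothetical budget: each person can sustain 10 ties.
    print(group_size, density_with_tie_budget(group_size, ties_per_person=10))
```

With a fixed number of ties per person, density falls roughly in proportion to group size, because the number of possible pairs grows much faster than the number of ties people can sustain.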

2) The Strength of Weak Ties. More novel information flows to individuals through weak than through strong ties. Because our close friends tend to move in the same circles that we do, the information they receive overlaps considerably with what we already know. ...This is so even though close friends may be more interested than acquaintances in helping us; social structure can dominate motivation. This is one aspect of what I have called “the strength of weak ties” (Granovetter, 1973, 1983). ... if cliques are connected to one another, it is mainly by weak ties. This implies that such ties determine the extent of information diffusion in large-scale social structures. One outcome is that in scientific fields, new information and ideas are more efficiently diffused through weak ties.

3) The Importance of “Structural Holes.” Burt (1992) extended and reformulated the “weak ties” argument by emphasizing that ... the strategic advantage that may be enjoyed by individuals with ties into multiple networks that are largely separated from one another. Insofar as they constitute the only route through which information or other resources may flow from one network sector to another, they can be said to exploit “structural holes” in the network. ... One reason resources may be unconnected is that they reside in separated networks of individuals or transactions. Thus, the actor who sits astride structural holes in networks (as described in Burt, 1992) is well placed to innovate.

Prospective employers and employees prefer to learn about one another from personal sources whose information they trust. This is an example of what has been called “social capital” (Lin, 2001). ... for goods where assessment is difficult, such as used cars, legal advice and home repairs, one-quarter to one-half of purchases in the United States are made through personal networks.

Studies of peasant markets often suggest that “clientelization,” defined as dealing exclusively with known buyers and sellers, raises prices above their competitive level.

Social relations are also closely linked to productivity. Economic models attribute productivity to personal traits, modifiable by learning. But one’s position in a social group can also be a central influence on productivity, for several reasons. One is that many tasks cannot be accomplished without serious cooperation from others; another is that many tasks are too complex and subtle to be done “by the book” (which is why the “rulebook slowdown” is a potent labor weapon) and require the exercise of “tacit knowledge” appropriable only through interaction with knowledgeable others.

“loyalty systems”—attempts to elicit cooperation from workers deriving not only from incentives but also from identification with the firm or with some set of individuals that encourages high standards and productivity.

Your personal data? If you can't take it out, don't put it in

Many companies are going open. Are they? Take OpenSocial (APIs by Google), Open Handset Alliance (alliance of 34 companies led by Google), and Open Media (an initiative announced by Bebo). What is common to all three initiatives, apart from the use of the word "open", is that none is directly aimed at benefiting the user (here).

Google is clearly "not responding to consumer needs. The applications it has demonstrated using Android are readily available on existing phones and operating systems. Users are not crying out for yet another interface for their phones".

Tim O'Reilly: "We don't want to have the same application on multiple social networks, we want applications that can use data from multiple social networks".


In another FT article:
"the technology industry has little financial incentive to reduce switching costs. While users are free to switch from one service to another at any time, the critical question is: can they take their data with them? Can they take their photos, their videos, their e-mails? And how easy is that? Data are often stored in proprietary file formats, which are protected by patents, and those are controlled by software and service vendors."

"Which raises the question: Do you actually own your own data?" The answer is unfortunately a qualified no! A very interesting research direction, right?

Sunday, November 25, 2007

Cold Reading, Statistical Discrimination and Initial Trust

"Cold reading is a technique used to convince another person that the reader knows much more about a subject than they actually do. Even without prior knowledge of a person, a practiced cold reader can still quickly obtain a great deal of information about the subject by carefully analyzing the person's body language, clothing or fashion, hairstyle, gender, sexual orientation, religion, race or ethnicity, level of education, manner of speech, place of origin, etc. This technique is also called offender profiling. Cold readers commonly employ high probability guesses about the subject, quickly picking up on signals from their subjects as to whether their guesses are in the right direction or not, and then emphasizing and reinforcing any chance connections the subjects acknowledge while quickly moving on from missed guesses".

This definition of cold reading reminded me of Posner's (harsh) review of Blink - Blinkered (html). There are two points from Posner's essay that may suggest how person A may set her initial trust in B. The first point may suggest that A does so based not only on B's behavior but also on the (social) group(s) to which B belongs. The second point may remind us that: asking for recommendations about B may be costly; and that Bayesian reasoning may help in rationally deciding whether to trust. Here are the two (by now-coveted) points:

(1) "If two groups happen to differ on average, even though there is considerable overlap between the groups, it may be sensible to ascribe the group's average characteristics to each member of the group, even though one knows that many members deviate from the average. An individual's characteristics may be difficult to determine in a brief encounter, and a salesman cannot afford to waste his time in a protracted one, and so he may quote a high price to every black shopper even though he knows that some blacks are just as shrewd and experienced car shoppers as the average white, or more so. Economists use the term "statistical discrimination" to describe this behavior. It is a better label than stereotyping for what is going on in the auto-dealer case, because it is more precise and lacks the distracting negative connotation of stereotype, defined by Gladwell as "a rigid and unyielding system." But is it? Think of how stereotypes of professional women, Asians, and homosexuals have changed in recent years. Statistical discrimination erodes as the average characteristics of different groups converge."

(2) "Such pratfalls, together with the inaptness of the stories that constitute the entirety of the book, make me wonder how far Gladwell has actually delved into the literatures that bear on his subject, which is not a new one. These include a philosophical literature illustrated by the work of Michael Polanyi on tacit knowledge and on "know how" versus "know that"; a psychological literature on cognitive capabilities and distortions; a literature in both philosophy and psychology that explores the cognitive role of the emotions; a literature in evolutionary biology that relates some of these distortions to conditions in the "ancestral environment" (the environment in which the human brain reached approximately its current level of development); a psychiatric literature on autism and other cognitive disturbances; an economic literature on the costs of acquiring and absorbing information; a literature at the intersection of philosophy, statistics, and economics that explores the rationality of basing decisions on subjective estimates of probability (Bayes's Theorem); and a literature in neuroscience that relates cognitive and emotional states to specific parts of and neuronal activities in the brain."
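
To make the Bayesian remark a little more concrete, here is a minimal sketch of how A might revise her trust in B after each interaction. It is only an illustration of Bayes's Theorem, not something taken from Posner's essay: the function name, the starting prior and the two likelihoods are all hypothetical values.

```python
def update_trust(prior, p_good_if_trustworthy, p_good_if_not, outcome_good):
    """Posterior probability that B is trustworthy after one interaction.

    prior: A's current belief that B is trustworthy (between 0 and 1).
    p_good_if_trustworthy / p_good_if_not: assumed chances of a good
    outcome when B is, or is not, trustworthy.
    """
    if outcome_good:
        like_t, like_n = p_good_if_trustworthy, p_good_if_not
    else:
        like_t, like_n = 1 - p_good_if_trustworthy, 1 - p_good_if_not
    evidence = like_t * prior + like_n * (1 - prior)
    return like_t * prior / evidence


belief = 0.5  # hypothetical initial trust before any interaction
for outcome in (True, True, False):
    belief = update_trust(belief, 0.9, 0.3, outcome)
    print(outcome, round(belief, 3))
```

Whatever the initial trust is based on, each observed interaction with B shifts the estimate, so the starting guess matters less as evidence about B's own behavior accumulates.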

Friday, November 23, 2007

Is Britney Spears Spam?

From my old blog (21st August 2007):
In the last post, I was raving on about ..., uhm, probably about trust bootstrapping, right? :-) I went from a definition of cold reading to a very personal interpretation of Posner's review of Blink. Now, in the same vein (i.e., keeping on being delirious), I move on to this nice paper, which carries out (in a way) not offender profiling but MySpace user profiling.
Title: Is Britney Spears Spam? (pdf)

Problem: In social network websites (e.g., MySpace), to decide whether to accept invitations to connect, users manually examine the senders' profiles. However, that may be time-consuming!

Existing Solutions: One may automate the acceptance of invitations by having users run trust propagation algorithms.

Complication: The authors write that using current trust propagation algorithms may be less than desirable since trust both decays with the number of hops and is usually one-dimensional.
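
The decay with hops is easy to see in a toy model. The sketch below assumes a multiplicative rule, where the trust along a chain of users is the product of the trust values on each link; this is one common simplification and not necessarily what the algorithms the authors have in mind do. The function name and the value 0.9 are hypothetical.

```python
from math import prod


def propagated_trust(link_trusts):
    """Trust along a chain of users under an assumed multiplicative rule.

    link_trusts: one value in [0, 1] per hop of the chain.
    The result is a single number, so the estimate is also one-dimensional.
    """
    return prod(link_trusts)


for hops in (1, 2, 5, 10):
    # Hypothetical case: every hop carries a fairly high trust of 0.9.
    print(hops, round(propagated_trust([0.9] * hops), 3))
```

Even with strong trust on every link, the propagated value drops quickly with distance, and it collapses everything known about a user into one score.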

Proposal: Use machine learning techniques to classify user profiles. The classification describes a profile across two dimensions: sociability and promotion. Based on these dimensions' values for a profile, users then decide whether to accept the invitation of that profile's user. To come up with a dataset on which to evaluate their algorithm, the authors randomly select and rate by hand MySpace users.
Future: I would:
> Apply a new trust propagation algorithm (pdf) to avoid trust decay and apply TRULLO (pdf) to handle multi-dimensional trust.
> Look at literature on criminal profiling. In UCL's main library, I noticed many books about criminal profiling. I wonder whether those books could inform a (future) paper titled "On profiling (not only criminals but) Web 2.0 users" ;-)
> Look at literature on statistical discrimination (previous post) and on customer profiling (mining customer data).
> Consider Tim Finin's comments: "It would be interesting to see how well various measures of the network structure around false and true profiles serve as features. I think this is very similar to the problem of recognizing spam blogs (splogs). In our work, we’ve found that local features work well, but splogs can also be recognized by looking at the network structure as well."

Thursday, November 22, 2007

Efficient and Decentralized PageRank Approximation

I read this very well-written paper. The authors set out to design a way to compute an approximate PageRank in a distributed and efficient way.

"Starting with the local graph G of a peer, the peer first extends G by adding a special node W, called world node since its role is to represent all pages in the network that do not belong to G. An initial JXP score for local pages and the world node is obtained by running the PR algorithm in the extended local graph G' = G+W.
...
we take all the links from local pages to external pages and make them point to the world node. ... as the peer learns about external links that point to one of the local pages, we assign these links to the world node."
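
The construction quoted above can be sketched roughly as follows. This is only my reading of the excerpt, not the authors' implementation: the self-loop on the world node, the unweighted links, the damping factor of 0.85 and all function names are assumptions made for illustration.

```python
WORLD = "W"  # the special world node standing in for all external pages


def extend_with_world_node(local_pages, links, known_inlinks):
    """Build the extended local graph G' = G + W as an adjacency dict.

    local_pages: set of pages held by this peer.
    links: (source, target) pairs whose source is a local page.
    known_inlinks: local pages the peer has learned are linked externally.
    """
    graph = {page: set() for page in local_pages}
    # Assumption: W keeps a self-loop to stand for external-to-external links.
    graph[WORLD] = {WORLD}
    for source, target in links:
        # Links leaving the local graph are redirected to the world node.
        graph[source].add(target if target in local_pages else WORLD)
    for page in known_inlinks:
        # External links into local pages are assigned to the world node.
        graph[WORLD].add(page)
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on an adjacency dict."""
    nodes = list(graph)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_score = {node: (1 - damping) / n for node in nodes}
        for node, targets in graph.items():
            # A page without out-links spreads its score over all nodes.
            receivers = targets if targets else nodes
            share = damping * score[node] / len(receivers)
            for target in receivers:
                new_score[target] += share
        score = new_score
    return score


# Hypothetical peer with three local pages; "x" and "y" are external pages.
local = {"a", "b", "c"}
links = [("a", "b"), ("b", "c"), ("c", "x"), ("a", "y")]
extended = extend_with_world_node(local, links, known_inlinks={"a"})
print(pagerank(extended))
```

The scores of the local pages are only initial estimates at this point; in the paper they are refined as the peer meets others and learns more about the links hidden behind the world node.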


Optimized Merging

"At a peer meeting, instead of merging the graphs and world nodes, we could simply add relevant information received from the other peer into the local world node, and perform the PR computation on the extended local graph and still the JXP scores converge to the global PR scores."

Fortune companies don't blog

"Just 6 percent of the Fortune 500 companies have one [blog], according to Socialtext's Fortune 500 Blogging Wiki.

Why is this? If blogging is still not widespread among companies, it is basically because they are afraid to lose control of their messages, they fear the transparency effect and are not altogether convinced of the legal limits of this medium."

http://www.iese.edu/en/files/6_34173.pdf

Monday, November 19, 2007

How innovation happens in Silicon Valley

NESTA - How innovation happens in Silicon Valley, London 20/11/07
George Osborne MP will deliver a keynote address. Reid Hoffman, Megan Smith and James Slavet will take part in an interactive panel discussion to consider why Silicon Valley has been so successful, and what lessons can be learned for the UK.

Sunday, November 18, 2007

The death of mass advertising?

Facebook Tries 'Social Advertising'. ..."a Facebook user who rents a movie on Blockbuster.com will be asked if he would like to have his movie choice broadcast out to all his friends on Facebook. And those friends would have no choice but to receive that movie message, along with an ad from Blockbuster."

MySpace reveals 'targeted' ads - "a pilot scheme that allows it to sell advertisements targeted to the individual tastes and interests of its millions of users...[It] will give advertisers the ability to drill down into 100 different user segments. This will allow them to differentiate between fans of romantic comedy films and action films, for example."