Sunday, December 8, 2013

Counting What Counts

(…and the qualitative versus quantitative divide.)

Not everything that can be counted counts, and not everything that counts can be counted.
-William B. Cameron, sociologist, 1967.

Anthony Townsend's book Smart Cities.

Saturday, November 9, 2013

William Whyte and Benches

One thing I recall from William Whyte's fantastic The Social Life of Small Urban Spaces is an idea about people: people like other people (so we like people watching), but some people (typically those who own, maintain, or work next to a small urban space) don't want "undesirables" in that space. "Undesirables", meanwhile, don't want other people around, so you want things, like benches, that bring other people in.

Here in NYC, we have some recent citywide benches. People can sit, relax for a moment, and watch other people. Passersby might feel a little better with some other people sitting there. But, what's interesting about these benches is the little raised seat dividers -- people can't sleep on these benches, probably addressing the "undesirables" issue that Whyte wrote about, while the rest of the bench lets people sit there, as we like to do.

Thursday, October 31, 2013

Halloween in EQII

Well sure, there's Nights of the Dead or whatever, but here's some fun homage I mean IP infringement almost I mean homage. Hopefully the text flow is ok, but we have "Norm Baites" and we can also wonder how many licks it takes to get to the center of a… well just a lollipop.

Wednesday, October 23, 2013

R and Regex Named Matches

I use Python and R to do stuff, Python for web scraping and text clean up, R for the analysis. But people have expanded the functionality of the two, and they are overlapping (it's hard enough not to get them confused some of the time as it is). I found I needed to use named groups in regex in R, and... couldn't figure it out. The web did not help.

SHORT VERSION: Turn on the Perl regex style (perl = TRUE) and go read some Perl regex pages, you'll be fine. Name the match: (?<name>...), to match it later: \\g{name}
This is completely different from what I was used to.
Google blogger will try to blow this post up as I want to have greater than and less than symbols. Yeah they can't get that right. Oh maybe it's working.

Typically, if I use regular expressions it's in Python, but R can do it too, and sometimes you'll want to do that. But there isn't a ton of help online about it (despite the links I have lined up to include below) and there are some things that confuse the issue (to Perl or not to Perl...).

If you just want to do some work with strings, first check out Hadley Wickham's stringr. It's awesome.

I, however, wanted to do some pattern matching that included a repeated section, so I needed regex's named group functionality, which I couldn't find or figure out in stringr. I was looking for patterns like this:

5,-1,5,-1,5....

...where 5 could be any number between 0 and 500 or so, but it would repeat. I had already removed spaces and added commas for easier parsing. (So other matches would be, like, 17,-1,17,-1,17.... etc.) So I needed to make sure the first match there was repeated, thus, named groups (or any group capture really, but I wanted to name it).

But I also couldn't figure it out in R. I can do it in Python, but the Python code for regex wouldn't work in R, alas. It was not clear what changes needed to be made.

One reason was that the \ needs to be escaped, that is, \\. So for example, \d+ needed to be \\d+. That wasn't too hard to figure out. But the rest was.

You can have Perl style, or POSIX, or not. Uh, what? No idea! I just needed it to work. Specifically, named groups in R. I found this page which said "Named subpatterns... are not covered here." Hmm. Another page said how "examples for the use of regex in R are rather rare" and had some useful examples. Eventually I figured I would set the Perl option and see what I could do; at least I could search on "perl", and that made all the difference as I could find out how to do named groups in Perl-style regex and there you go.

Name a group in Python: (?P<name>...)    
Name a group in R, Perl style: (?<name>...)

Note: I expect the less than and greater than symbols fail at some point.

Reference it later in Python: (?P=name)
Reference it later in R, Perl style: \\g{name}
    So curly braces (Perl?), and double backslash for R.
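For comparison, here is a minimal Python sketch of the kind of match described above: a number, then ",-1,", then the same number again, repeated. The pattern and the function name repeated_number are my own illustration, not the code from the original analysis, and it assumes the whole line is nothing but the repeating pattern.

```python
import re

# Hypothetical pattern for lines like "5,-1,5,-1,5".
# (?P<num>...) names the captured group; (?P=num) demands the same text again.
PATTERN = re.compile(r"^(?P<num>\d+)(?:,-1,(?P=num))+$")

def repeated_number(text):
    """Return the repeated number (as a string), or None if the line doesn't fit."""
    match = PATTERN.match(text)
    return match.group("num") if match else None

print(repeated_number("5,-1,5,-1,5"))     # 5
print(repeated_number("17,-1,17,-1,17"))  # 17
print(repeated_number("5,-1,6,-1,5"))     # None (the 6 breaks the repeat)
```

Going by the syntax listed above, the R version of the same pattern would presumably swap (?P<num>...) for (?<num>...) and (?P=num) for \\g{num}, double the backslash in \\d+, and pass perl = TRUE.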

Some useful Perl-regex links:

Although honestly one problem I have with a lot of online examples (and the R help files) is that they are completely arcane. If I'm looking for help with syntax, a complex example isn't going to solve it; that's bad usability.

Post Keywords: regex, R, r-project, cran, grep, regular expressions, named groups.

Monday, October 21, 2013

iTunes Radio

Still working out a few kinks, as you can see (not repeating recently played songs). The result is I have had Fiona Apple stuck in my head for four days. (Not 'shopped.)

R and head() and tail()

So, if you're not careful using R's head() and tail() commands, you'll end up with a little surprise. Perhaps I should say not careful reading the documentation.

The head() and tail() functions do not return just one item from the list (or whatever), they return several. So head does not mean first, and tail does not mean last.

Read carefully: "Returns the first or last parts of a vector, matrix, table, data frame or function" [Italics added.] PARTS. Plural. An 's' on there.


our_list = list(3, 7)       # Makes a list with two items, the first is 3, the second is 7. Plain numbers (R stores these as doubles; 3L and 7L would be true integers).

Note I used "=" since if I use a "less than" bracket, Google blogger freaks out. (The typical R code is "less than" followed by a dash, which makes an arrow, representing "gets", the left side gets [is given] the right side.) It's giving me a hard time with formatting as it is.
If you type in "our_list", R will print our_list:

[[1]]
[1] 3

[[2]]
[1] 7

So, the first item in our_list [[1]] has one item [1], which is a 3.
The second item in our_list [[2]] has one item [1], which is a 7.

I don't fully understand the difference between [[x]] and [x], it seems mysterious.
Edit: The R Inferno, 8.1.54.... aha. Still is mysterious, though.

If you type:
head(our_list)
...it would be nice to get just the head, that is, the first item. But no, you get the whole list (since the list is small you get the entire list; for a longer list, head() returns only the first six items by default).

What you want is:
head(our_list, n=1)

...where the 'n' gives the value of how many items you want. (You don't actually need the n=, I have noticed.)
When I try "n=3" for this two-item list, it just gives the first two items (i.e., the entire list in this case) and does not give an error.

Note I made our_list have [[1]] == 3 and [[2]] == 7 since far too often [[1]] == 1 and [[2]] == 2 and really people that's just not clear. If you're trying to make a useful example, don't make it where the same symbols (1, 2) are being used to represent widely different things.

Also, Googling for info on R's "by" command is just impossible, as "r by" is not a specific enough search string (in-context it's fine though). That's why I like books (yes paper) sometimes, if the index is any good, there you go.

Sunday, October 6, 2013

R For Loops Indexing

R does something a little unexpected -- well, unexpected to me -- with the indexing of the for loop (and maybe this is more general, I don't know).

If you have... (note I can't use "gets" with the arrow made of brackets, Google does not parse that in terms of HTML and it kills the code...)

n = 5
for (i in 1 : n+1) {
    # do stuff
}

...the index runs from 2 to 6, not 1 to 6. In R the : operator binds more tightly than +, so the +1 gets added to both ends (really, to every element of 1:n).
So it's parsed as (i in (1:n) + 1).

What you need is....

for (i in 1 : (n+1)) {....
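If it helps to see the two readings side by side, here is a small sketch in Python (the other language I use) that mimics what R does in each case. This is an emulation, not R code; the names as_parsed and as_intended are made up for the illustration.

```python
n = 5

# How R reads 1 : n+1, i.e. (1:n) + 1:
# build 1..n first, then add 1 to every element.
as_parsed = [i + 1 for i in range(1, n + 1)]

# What was intended, 1 : (n+1):
# compute n+1 first, then build 1..(n+1).
as_intended = list(range(1, n + 2))

print(as_parsed)    # [2, 3, 4, 5, 6]
print(as_intended)  # [1, 2, 3, 4, 5, 6]
```

Same endpoint, different start, and one fewer trip through the loop.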

This is related to off-by-one errors (humorous explanation and more serious explanation), but I certainly didn't expect it to be parsed like that.

Thursday, September 26, 2013

PS2/PS3 Controllers and Time

I loved my PS2. Some great games. So great that I bought a used PS3 to get the kind that does the PS2 emulation in hardware and not just software (although no, I have not actually played any of my PS2 library, but that's a different story -- I could if I wanted to). But I had it for a long time. And I've had the PS3 for a long time. How long? Well, that's the problem.

I've had them so long that the rubber on the thumbsticks has degraded and become sticky. Not just a little grippy, or tacky, but sticky, in a gross, "this rubber is chemically falling apart" way. That's a problem. (Well ok I got rid of my PS2, but that was after the sticky thumbsticks.)

This has not happened (yet?) with the controller for my Xbox 360.

Monday, September 23, 2013

iPhones and Finger Prints

What all of these excitable "we can copy fingerprints!", "fingerprints make horrible security!" posts miss is the really awesome thing about fingerprints versus a security code on your phone: Your 9 year old will try really, really hard to see what your code is (as will your 8 year old, your 7 year old, your 6 year old, your 10 year old, and so forth), but kids cannot copy your fingerprint and make a usable replica to get into your phone--unless you have trained your kid to be that amazing, and then not only do you have other things to focus on besides your kid making fake fingers but your kid probably has his or her own device and doesn't want yours.

Edit: And, c/o Daring Fireball, about 50% of people don't even have passcode security enabled on their phones.

Saturday, September 14, 2013

Google H Score, Year 5

For the fifth year in a row (who blogs for that long?), here's my tracking of the cites for some of my papers (since I don't have that many papers that are cited that much, it's not too difficult). Google's lack of table support is really annoying though. 3 cites or more only (this may make things difficult in the future, to be honest). My H score didn't move this year, need to get some of the lesser-cited papers moved up for that. I am happy to report that my Digital Elves paper has a cite, though!

Article (short title)    Year    2009    2010    2011    2012    2013
Mechs of an online public sphere 
To broadband or not to...
Honey, I shrunk the world!
Playing Internet curveball...
Online org...
Tweets and votes...
A cross-national study...
Global citation patterns...
Technology as place
Strat and global elite theory
Copyright notices...

Values as of early September in 2009, 2010, 2011, 2012, and 2013.
x = mislabeled or missing from Google.
? = I didn't record it that year.

And, the previous years: Year 4, Year 3, Year 2, Year 1.
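As for why the score only moves when the lesser-cited papers pick up cites: assuming the "H score" here is the usual h-index (the largest h such that h papers have at least h cites each), a rough sketch looks like this. The citation counts below are made up for illustration and are NOT the numbers from my table.

```python
def h_index(cites):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(cites, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Made-up citation counts, purely illustrative.
print(h_index([40, 22, 9, 6, 5, 4, 3]))  # 5
# Piling more cites onto the top paper changes nothing...
print(h_index([90, 22, 9, 6, 5, 4, 3]))  # 5
# ...but lifting the papers further down the list does.
print(h_index([40, 22, 9, 6, 6, 6, 3]))  # 6
```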

Tuesday, September 10, 2013

Pants! The Hulk and EverQuest 2

Another case of intellectual property infringement by a giant IP company, er, homage in a game. What's amusing here is that it is cross-genre, from fantasy to sci-fi, which is not as common as within-genre.

What we have is The Hulk's pants, which are in the original conception purple and torn, and The Hulk is kinda dumb but strong.

Edit: Oh look I blogged about these before, just without an image! Nice.

Sunday, August 11, 2013

User Evolved Practice

I don't actually know what the official phrase is (the term of art), but I bet there is one -- a practice in a system that the users create on their own, independently from the creators or maintainers of that system. Several weeks ago, someone had mentioned to me that users of the bike share program (which is found internationally and across the US, here's the NYC site) would, upon returning a bike that had a problem, reverse the seat. This acted as a sign to other users. This was not official, and somehow had to spread in terms of knowing what it meant (perhaps via forums). So I don't know all the details, but I finally saw one at the rack outside my office! (And no, this is not a staged photo -- I could have done that but where's the fun in that?)

Monday, July 29, 2013

Face to Face as Technology

Great article over at the New Yorker about the diffusion, not of technology, but of technique, which itself (I would argue) is a type of technology. (The author is Atul Gawande.) One quote which is particularly nice is...

People talking to people is still how the world's standards change.
And, an Ev Rogers quote, connected to the same idea, which is reinforced by the article:
Diffusion is essentially a social process through which people talking to people spread an innovation. [italics added]
The two quotes, which are not adjacent to each other, suggest the idea (backed up by research) that technology is not spread by technology but by sociology (to keep the -ology endings). More than technology, though, is technique, which some authors have written about (I am thinking about the STS researchers, like Hughes and others).

I almost forgot: Teaching is essentially a type of diffusion of technique. This is one of my hesitations about MOOCs.

Friday, July 19, 2013

My Next Computer

I don't have amazing Gimp skills, but hey here's my (pun intended) next computer...

(Orig image is one of the currently soon to be new Mac Pros, and of course, its new logo.)

Friday, June 28, 2013

Most Guilds Are Small

This finding was a little odd -- most guilds in two games I looked at were small, very small. Many were either as small as possible or smaller than the smallest group size that the code allows. Specifically, the two games were both Sony games, PlanetSide 2 and EverQuest 2. Sony provides a census for both games, so you can scrape all the guilds for EQ2. The API for PS2 is different, so you can't do that for outfits (guilds) there, but a secondary website had an unknown but large sample of PS2 outfits.

This is somewhat odd because we generally consider that the point of a guild, at least from the point of view of the developers, is to allow and foster long-term group cohesion among many players so those players can go and do things that you can't do in solo play or small group play. However, a lot of players apparently don't actually use guilds in this way at all. They are using them, I think (at least in EQ2 which I play), for personal satisfaction in a variety of ways. (Personal satisfaction of leveling, designing their own guild hall, "I have a guild", a "home" for the player's quiver of alts if not in a larger guild, a guild for their family members...)

Google doesn't have great table support, but I'll put some data here anyways.

For PS2, having an outfit of two players... well I'm not really sure what that does (in terms of the benefit).

PS2 (Capped at 21 here in this table but there are many larger outfits.)

Outfit Size    How Many
There were some much larger outfits, but you see the pattern (remember, this is an unknown and non-random sample).
Here's a graph, capped at 50 on the X-axis.

There were similar results for EQ2, here shown by number of accounts. (PS2 and EQ2 do characters and accounts a little differently.) This, a census not a sample, is the subset of guilds which I determined were "active" during a 4-week period a few months ago when I scraped the census. There are far more guilds registered in EQ2 but most aren't active. Most guilds are small, but most accounts are in a large guild. A "group", as defined in code, well a full group, is 6 people, so guilds with 1-5 people can't fill a full group on their own (although with the recent additions of mercenaries they can, sort of).

Note that the "Accounts" column is non-linear. This is for the following reason: Full group = 6, x2 raid = 12, full raid (x4) = 24. Once it hits 24 I scale it.

Accounts    No. of Guilds    Total Accts.

And, a visual image of all that, I think each bar width is 5 characters. (Oops initially I put in the wrong chart, it had characters, not accounts...) Ok I am not sure why there is a discrepancy between the table and the chart with regard to the first two categories... that is.... slightly disturbing, but overall the point still holds.

Both are long tail. But that people are making really small guilds, ones that are unusably small in terms of many types of gameplay, is decidedly of note.

Thursday, June 27, 2013

Academic Journals and Double Billing

So, we academics write articles and send them to journals. Other academics who are on the editorial boards may screen them for review, or actually review them. Other academics also review the papers.

On the back end, academic libraries complain about the high prices that publishers charge them for those same journals.

Let's review who pays for the work of those academics in the first paragraph.

  • Time spent on research for an article, paid for by: University
  • Time spent writing an article, paid for by: University
  • Time spent doing whatever the ed board does, paid for by: University
  • Time spent reviewing articles, paid for by: University
You will note that at no point in this process does a typical journal publisher actually pay for any of the time spent working; it is all paid for by the university where the academic resides (or, perhaps a grant, which is still not the publisher).

I am probably missing a few nuances, but usually this kind of behavior is called double billing, and it is frowned upon.

Wednesday, June 26, 2013

Big Data Divide

About a month ago I was at CeDEM, the Conference for E-Democracy and Open Government, in Krems, Austria. Several projects were focused on communication between local municipalities and the citizenry, with a focus on communication and knowledge sharing that could go either way (such as, a municipality learning what local citizens don't know, and getting that information to them). So a lot of it concerned data collection and analysis.

That should sound familiar, given recent events. I pointed out how there were two sectors that already had the three needed factors:

  1. Data collection capability,
  2. Data (have been collecting),
  3. Data analysis capability.
These two sectors are, of course, intelligence agencies for at least the US and perhaps Britain (and other governments on a smaller scale, most likely), as well as the information companies themselves (in this case I mean information conduit providers, like Google, AT&T, and Verizon, to name a few).

Sure we could pass laws forcing these two sectors to help local governments provide better services (one eternal promise of not just the internet, but of the information future -- it's always a day away), but I doubt that will happen. That's rather sad, and falls short of the "of the people, by the people, for the people" ideal from US President Lincoln.

Tuesday, June 25, 2013

The NSA and "Unreasonable"

The 4th Amendment to the US Constitution, part of the Bill of Rights (the first ten amendments, as a package, minus two), ratified in 1791.

Perhaps the, or a, pivotal section relies upon the word "unreasonable":

The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated...
Is it unreasonable to have the government have access to data that the giant info-companies already have access to and have in fact generated and searched through themselves? It certainly seems unparalleled, and I don't for the life of me see how it is actually legal, but given that other organizations have been doing this for years for their own good I have a bit of a hard time seeing how it is unreasonable to have some other people do exactly the same thing as others while looking for terrorists instead of for their own capital gains.

People, especially at Wired Magazine, used to love to name-drop Bentham's panopticon back in the 1990s, but that idea hasn't been used much if at all in the recent dust up. Perhaps it seemed cool until it happened.

Edit: Last week's New Yorker lead article does mention Bentham, but since I was off at ICA in London I only saw it today. The article also astutely points out how we knew the NSA was doing this for years, and knew this years ago.

Saturday, March 23, 2013

More Game Homage

Two more interesting bits of homage from some games:

  • EverQuest II and Back to the Future
  • Minecraft and Portal
Most of the homage in EQII is fantasy related, but on occasion there is general nerd humor (like at least one for Superman and one for Star Trek, although there is also one from The Love Boat, which is unusual). So, sci-fi in fantasy, even fantasy like EQII and WoW which have steampunk-like gnomish tinkerers (when did this happen to allow these games to do this?), is unusual.

Back to the Future's flux capacitor in EQII:
(EQII link) (BttF link)
As for why it's a hat, I don't know, but there it is.

Everyone, even my 9 year old nephew, knows and loves Minecraft. Everyone should also love Portal, since it was amazing (as is Portal 2). But, Minecraft is strongly influenced by some other games (as are many things influenced by other things), among them one of my favorites Dwarf Fortress. Portal was not among them as far as I know, but here you have the infamous cake, which, as it turns out, really is a lie. (Or is it, yes, yes it is, but you can think it's as delicious and moist [at 2:18] as you like.)

The cake is no lie in Minecraft, except it is:

Copying helps us make things.

Thursday, January 31, 2013

Song and Dance

Given the connection between community and communication (as can be easily seen in the English words, from Latin), I somehow was clued into the importance of music and dance as community reinforcing activities. Communities are, in some ways, where a group of people coordinate their actions with each other -- the same is true of dance (and the music that goes along with it). So I had an "of course!" moment when I read this:

The Sesotho verb for singing (ho bina), as in many of the world's languages, also means to dance...
From Daniel Levitin's This Is Your Brain On Music, p. 7. I'm not a linguistics expert, and I know English (sing, dance) and Spanish (cantar, bailar), and as you can see neither of those clued me in.

Thursday, January 10, 2013

Online Dating Math

If 1 in 5 (20%) relationships start online, that suggests that 4 in 5 (80%) start offline. Get away from your screen and go out into the world! Or something. And "See Pics of Single Women", really? What is this, a porn ad? I don't see that the people could have thought about the math much, or they think their audience is not very good at math. And yes they are probably trying to highlight the word "Free" but it shouldn't be capitalized, and they should know this is the internet where there are people who can do both math and grammar and insist on them being done correctly.