Parallel R (Linux User & Developer Issue 108)

Issue 108 of Linux User & Developer had an article, “Supercharge your R experience”, about using parallel techniques to analyse large amounts of data with R.

The four-page step-by-step tutorial covers installing the needed packages (multicore, Rmpi and SNOW) and walks through basic examples to give you an idea of how to set up and run parallel configurations for R. It doesn’t go very deep, but it gives you some ideas and points you to the packages needed for learning more. It’s also suitable for rookies, since you really don’t have to know much R to follow along.
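
To give a flavour of what such a setup can look like, here is a minimal sketch using the parallel package that ships with R 2.14 and later and builds on multicore and snow. It is my own illustration, not code from the article; the two-worker cluster and the toy squaring function are assumptions.

```r
library(parallel)

# Start a small cluster of two worker processes (assumed size; adjust to your machine)
cl <- makeCluster(2)

# Apply a toy function to each input on the workers and collect the results
squares <- parSapply(cl, 1:4, function(x) x^2)

# Always shut the workers down when finished
stopCluster(cl)

print(squares)
```

The same pattern scales to heavier functions: only the function passed to parSapply() and the number of workers change.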

If you’re interested in the topic, you should check the magazine for more. I’m not a regular reader of LU&D, so I’m not sure how often R appears in their issues, but I was happy to hear about this article from a friend.

You should also check out the CRAN Task View page on High-Performance and Parallel Computing with R.

MuroBBS Programming Challenge 1

It’s been a while since the last post. I started my studies as a statistics major at university and have done little with R besides reading other blogs and taking our university’s R course, which covered just the basics, so nothing new there.

I mentioned earlier that the Jets are back, and they’re doing pretty well. Teemu Selänne returned to Winnipeg to play against the Jets a while ago and it was an awesome night.

This post is about a programming “challenge” on MuroBBS, a Finnish computer-themed forum (in Finnish), where people post programming challenges and others write solutions in different languages. I chose R since it’s not so familiar there.

The first warm-up was to create the following sequence, given the maximum number of stars:
*
**
***
****
*****
****
***
**
*
My code for that:

# MuroBBS Ohjelmointipähkinä (Programming Challenge) 1
# http://murobbs.plaza.fi/1707865276-post2.html
stars <- function(n) {
  sequence <- c(1:n, (n-1):1)
  for (i in sequence) {
    # Changed from print to cat, thanks for commenting Tal!
    # sep = "" keeps cat from putting spaces between the stars
    cat(rep('*', i), "\n", sep = "")
  }
}

When called, it gives output:

> stars(5)
*
**
***
****
*****
****
***
**
*

However, I started wondering whether there is a solution that doesn’t require a loop but instead uses some of R’s own vector/list handling techniques.

Edit: I want to add Tony Breyal’s solution from the comments, as it’s clearly superior to my inefficient for-loop:

stars2 <- function(n) {
  myseq <- c(1:n, (n-1):1)
  stars <- sapply(myseq, function(i) paste(rep("*", i), collapse = ""))
  cat(stars, sep = "\n")
}
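
Another loop-free variant, as a minimal sketch: base R’s strrep() (available from R 3.3.0) is vectorised over the repeat count, so the whole pattern comes out of a single call. The name stars3 is my own, and seq_len() is used so that n = 1 also works.

```r
stars3 <- function(n) {
  # Star counts go up to n and back down, e.g. 1 2 3 2 1 for n = 3
  counts <- c(seq_len(n), rev(seq_len(n - 1)))
  # strrep() repeats "*" once per element of counts
  strrep("*", counts)
}

# Print one line per element
cat(stars3(5), sep = "\n")
```

Returning the character vector instead of printing inside the function makes it easy to test or reuse the result.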

 

I’ve read that for-loops can be really slow, so after testing Tony’s solution I did a little benchmarking to see whether my solution really was terrible:

library(rbenchmark)
benchmark(stars(100), stars2(100), columns = c("test", "elapsed", "relative"))

#          test elapsed relative
# 1  stars(100)   20.72  26.5641
# 2 stars2(100)    0.78   1.0000

So as you can see, mine was over 26 times slower than Tony’s.

More challenges and solutions in later posts.

NHL Statistics – Goals scored by age

NHL Statistics, part 1

Goals scored by age

The Data Twirling blog gave instructions on how to get NHL statistics from the web, and I saw an opportunity to learn R and statistics with data I know and understand, so it would be easier to notice when my graphs show bad information.

When talking about hockey, scoring is one of the most important things, and everybody loves scorers, at least as long as they are on their favourite team. After Teemu Selänne of the Anaheim Ducks played a magnificent season at the age of 40, I wanted to know how much age affects scoring and why.


The upper chart is the 2010-2011 season and the lower chart covers seasons 1960-2011. The red dots are the median goals scored at each age, the red line is the mean of those per-age medians, the blue dots are the maximum goals scored at each age and the blue line marks 50 goals, which is something of a milestone for scoring in one season.

Before publishing this I went through many different combinations, and one interesting thing was that in seasons 2000 to 2011, players aged 26 to 29 seemed to score much less than players aged 30 to 33 by median.

I was about to conclude that players have their best years after they turn 30 (I didn’t yet have the max goals in the chart). Then I thought about it some more and realised there is a reason why players under 20 and over 30 have higher medians. A player who is 18 and plays in the NHL has to be good or he wouldn’t be there, so the median is raised when there are only a few players and they are all the best of their age. Later, starting from age 20, more and more average players come along, as teams need players in addition to superstars. The same goes when a player turns 30: if you are not good enough, you’ll be traded away or sent down to the AHL and replaced with younger stars. So only a few good players stay and the medians keep rising. In their 40s there are players who play because they are good (like our own Teemu Selänne) and players who play because they’re loyal franchise players whom nobody will let go or trade.

The basic code is adapted from the Data Twirling blog I mentioned in the beginning; I just added my own things. So credit goes there.

SkaterStats.R is from the blog and I made no changes there:

#######################################################################################
# Function to scrape season skater statistics from Hockey-reference.com
#######################################################################################
GrabSkaters <- function(S) {

# The function takes parameter S which is a string and represents the Season
# Returns: data frame

require(XML)

## create the URL
URL <- paste("http://www.hockey-reference.com/leagues/NHL_",
S, "_skaters.html", sep="")

## grab the page -- the table is parsed nicely

#for reading from Internet
tables <- readHTMLTable(URL)

#for reading local file so the servers won't get extra load
#tables <- read.csv('NHL.csv')

ds.skaters <- tables$stats

## determine if the HTML table was well formed (column names are the first record)
## can either read in directly or need to force column names
## and

## I don't like dealing with factors if I don't have to
## and I prefer lower case
for(i in 1:ncol(ds.skaters)) {
ds.skaters[,i] <- as.character(ds.skaters[,i])
names(ds.skaters) <- tolower(colnames(ds.skaters))
}

## fix a couple of the column names
colnames(ds.skaters)
## names(ds.skaters)[10] <- "plusmin"
names(ds.skaters)[11] <- "plusmin"
names(ds.skaters)[18] <- "spct"

## finally fix the columns - NAs forced by coercion warnings
for(i in c(1, 3, 6:18)) {
ds.skaters[,i] <- as.numeric(ds.skaters[, i])
}

## convert toi to seconds, and seconds/game
## ds.skaters$seconds <- (ds.skaters$toi*60)/ds.skaters$gp

## remove the header and totals row
ds.skaters <- ds.skaters[!is.na(ds.skaters$rk), ]
## ds.skaters <- ds.skaters[ds.skaters$tm != "TOT", ]

## add the year
ds.skaters$season <- S

## return the dataframe
return(ds.skaters)
}

median_of_goals.R is partly from the blog and partly my own writing:

## Creates a plot of goal medians by age of wanted season.
## uses SkaterStats.R by Data Twirling blog (http://www.brocktibert.com/blog/)

# Load the plotting and SQL libraries, then source the file with the GrabSkaters function
library("ggplot2")
library("sqldf")
source("SkaterStats.R")

#-----------------------------------------------------------------------
# Use the function to loop over the seasons and piece together
#-----------------------------------------------------------------------

## define the seasons -- the 2005 dataset doesn't exist
## if I was a good coder I would trap the error, but this works
SEASON <- as.character(c(1960:2004,2006:2011))

## create an empty dataset that we will append to
dataset <- data.frame()

## loop over the seasons, use the function to grab the data
## and build the dataset
for (S in SEASON) {

require(plyr)

temp <- GrabSkaters(S)
dataset <- rbind.fill(dataset, temp)
print(paste("Completed Season ", S, sep=""))

## pause the script so we don't kill their servers
Sys.sleep(10)

}

dataset <- dataset[dataset$tm != 'TOT', ]

## UNTIL HERE, CODE IS FROM DATA TWIRLING BLOG
## STARTING FROM HERE IS MY CODE

## select by season - I have both all time and latest season stats done at once
alltime <- sqldf("SELECT * FROM dataset")
age2011 <- sqldf("SELECT * FROM dataset WHERE season=2011")

## sort agelist by age
age2011 <- age2011[with(age2011, order(age2011$age)),]
alltime <- alltime[with(alltime, order(alltime$age)),]

## Create a dataframe with unique ages
dfage <- data.frame(unique(age2011$age))
dfage.alltime <- data.frame(unique(alltime$age))

## rename column
names(dfage) <- "age"
names(dfage.alltime) <- "age"

# Count medians of goals by age
gmedians <- tapply(age2011$g, age2011$age, median)
alltime.gmedians <- tapply(alltime$g, alltime$age, median)

# Add gmedians to dfage data frame
dfage$gmedians <- gmedians
dfage.alltime$gmedians <- alltime.gmedians

# Modify the plot theme: black-and-white theme with a white panel background
theme_set(theme_bw() + theme(panel.background = element_rect(fill = "white", colour = NA)))

# Count maximum scored goals by age
gmax <- tapply(age2011$g, age2011$age, max)
dfage$gmax <- gmax

alltime.gmax <- tapply(alltime$g, alltime$age, max)
dfage.alltime$gmax <- alltime.gmax

# Create the plots: red dots are medians, blue dots are maxima
plot <- ggplot(dfage, aes(x = age, y = gmedians)) +
  geom_point(colour = "red") +
  geom_point(aes(y = gmax), colour = "blue") +
  geom_hline(yintercept = mean(dfage$gmedians), colour = "red", size = 0.5) +
  geom_hline(yintercept = 50, colour = "blue", size = 0.5) +
  scale_x_continuous("Age", breaks = min(dfage$age):max(dfage$age)) +
  scale_y_continuous("Median of goals", breaks = min(dfage$gmedians):max(dfage$gmax)) +
  ggtitle("NHL 2011 Season - Median of goals by age")

alltime.plot <- ggplot(dfage.alltime, aes(x = age, y = gmedians)) +
  geom_point(colour = "red") +
  geom_point(aes(y = gmax), colour = "blue") +
  geom_hline(yintercept = mean(dfage.alltime$gmedians), colour = "red", size = 0.5) +
  geom_hline(yintercept = 50, colour = "blue", size = 0.5) +
  scale_x_continuous("Age", breaks = min(dfage.alltime$age):max(dfage.alltime$age)) +
  scale_y_continuous("Median of goals", breaks = min(dfage.alltime$gmedians):max(dfage.alltime$gmax)) +
  ggtitle("NHL 1960-2011 Seasons - Median of goals by age")

googleVis library in use

Data on the map

While surfing around the Internet I accidentally found the googleVis library for R, and especially the gvisGeoMap function, which creates a map based on country data. In the table hockey scene we have a great World Ranking system which pretty much tells you who’s the top dog and also who the active players are, ’cause tournaments expire after 24 months.

So I took a closer look at googleVis/gvisGeoMap and found it very straightforward to use, and the result got a great response from players around the world.

The problem I faced with the data was that it used country codes like FIN, SWE, RUS, GBR, LAT and so on, and gvisGeoMap doesn’t recognize them, so I had to write a small script to recode those. If there is a better and more efficient way to do it, please comment, ’cause I believe there is.

dynamic_players_to_map.R:

## Using the google visualization API with R
## Creates a map of table hockey players by country. Only selects players who have a World Ranking entry (at least 1 point within 2 years)
## @author rocknrblog
## requires 'rename_countries.R'
## Version 0.2, 21.6.2011
## Feel free to use and modify

# Loads googleVis-library needed for map creation
library(googleVis)

# Reads the World Ranking file
input<- read.table("http://ithf.info/stiga/ithf/ranking/ranking.txt", sep="\t", header=TRUE, skip=1)

# Convert the nation codes to a character matrix (I had to do this so I could do the recoding/renaming)
nat <- as.matrix(input$Nation)

# Rename the nation-data with corresponding country names listed on the file
source('rename_countries.R')

nat <- as.factor(nat)

# Nation names to data frame 'nation'
nation <- data.frame(x = nat)

# Frequencies of nations' players
ranking <- as.data.frame(table(nation), stringsAsFactors = FALSE)

# Create Map-dataframe of nations and frequencies
Map<- data.frame(ranking$nation, ranking$Freq)

# Name Map's attributes
names(Map)<- c("Country", "Number of Players")

# Create a map as gvisGeoMap
Geo=gvisGeoMap(Map, locationvar="Country", numvar="Number of Players", options=list(height=600, width=800, dataMode='regions'))

# Plot the map graphics file as HTML/JS
plot(Geo)

And the rename_countries.R:

## Tool script for recoding ITHF WR country codes to country names understood by googleVis
## @author Juha-Matti Santala
## Version 0.2, 21.6.2011

nat <- replace(nat, nat=="GBR", "United Kingdom")
nat <- replace(nat, nat=="FIN", "Finland")
nat <- replace(nat, nat=="RUS", "Russia")
nat <- replace(nat, nat=="AFG", "Afghanistan")
nat <- replace(nat, nat=="ALB", "Albania")
nat <- replace(nat, nat=="AUS", "Australia")
nat <- replace(nat, nat=="AUT", "Austria")
nat <- replace(nat, nat=="BLR", "Belarus")
nat <- replace(nat, nat=="CAN", "Canada")
nat <- replace(nat, nat=="CHN", "China")
nat <- replace(nat, nat=="CRO", "Croatia")
nat <- replace(nat, nat=="CZE", "Czech Republic")
nat <- replace(nat, nat=="EST", "Estonia")
nat <- replace(nat, nat=="DEN", "Denmark")
nat <- replace(nat, nat=="FRA", "France")
nat <- replace(nat, nat=="GER", "Germany")
nat <- replace(nat, nat=="HUN", "Hungary")
nat <- replace(nat, nat=="IND", "India")
nat <- replace(nat, nat=="ITA", "Italy")
nat <- replace(nat, nat=="JAP", "Japan")
nat <- replace(nat, nat=="KAZ", "Kazakhstan")
nat <- replace(nat, nat=="LAT", "Latvia")
nat <- replace(nat, nat=="LIB", "Lebanon")
nat <- replace(nat, nat=="LTU", "Lithuania")
nat <- replace(nat, nat=="NED", "Netherlands")
nat <- replace(nat, nat=="NOR", "Norway")
nat <- replace(nat, nat=="PAK", "Pakistan")
nat <- replace(nat, nat=="ROM", "Romania")
nat <- replace(nat, nat=="SRB", "Serbia")
nat <- replace(nat, nat=="SVK", "Slovakia")
nat <- replace(nat, nat=="SLO", "Slovenia")
nat <- replace(nat, nat=="KOR", "South Korea")
nat <- replace(nat, nat=="ESP", "Spain")
nat <- replace(nat, nat=="SUD", "Sudan")
nat <- replace(nat, nat=="SWE", "Sweden")
nat <- replace(nat, nat=="SUI", "Switzerland")
nat <- replace(nat, nat=="UKR", "Ukraine")
nat <- replace(nat, nat=="USA", "United States")

rename_countries.R is definitely not pretty and it looks stupid, but I found no other way to do it and I wanted to get some graphs working.
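
One possibly tidier approach, as a sketch only: keep the codes and names in a single named character vector and recode by indexing into it. The function name recode_nations and the shortened lookup table are my own illustration, and codes missing from the table are left unchanged.

```r
# Lookup table: names are ITHF codes, values are country names (only a few shown)
country_names <- c(
  GBR = "United Kingdom",
  FIN = "Finland",
  SWE = "Sweden",
  LAT = "Latvia"
)

recode_nations <- function(codes, lookup = country_names) {
  codes <- as.character(codes)
  # Only touch codes that have an entry in the lookup table
  known <- codes %in% names(lookup)
  codes[known] <- unname(lookup[codes[known]])
  codes
}

print(recode_nations(c("FIN", "SWE", "XYZ", "FIN")))
```

Adding a country then means adding one entry to the vector instead of one more replace() line.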

You can find the map in use here. The next thing I’m planning is some kind of visualisation of the history and development of player counts around the world, so players could see how countries have grown or shrunk over the years.

Thank goodness it’s Jets!

NHL Entry Draft 2011

The 2011 NHL Entry Draft is over, and what a two days it was. Of course the biggest news was the name of Winnipeg’s new franchise, and oh how happy I am for it to be the Jets. Ever since Teemu Selänne started playing there in 1992 (having been drafted #10 overall in 1988), the Jets have had a place in my heart. I just hope they’ll bring the old 1992 logo back.

As for the actual draft this year, there are a few players I’m interested in seeing play in the NHL. First of all, of course, #1 Ryan Nugent-Hopkins, who was drafted by the Edmonton Oilers, and #2 Gabriel Landeskog (Colorado Avalanche). While the whole world watches the top picks, I’m interested in lower-ranked draftees and Finns. The Buffalo Sabres’ #16 pick, Joel Armia from Porin Ässät, is one great player in what I think is a new golden generation, alongside players like Mikael Granlund (Minnesota #9, 2010), Joonas Nättinen (Montreal #65, 2009) and Joonas Donskoi (Florida #99, 2010). After maybe one more year of developing in the Finnish Elite League, I’m confident that Armia will break into the Sabres’ roster.

Because of the Winnipeg Jets, their first-round pick, #7 Mark Scheifele, got my attention, and one Canadian youngster I really like is the St. Louis Blues’ second-round pick, #32 Ty Rattie.

Joe Morrow, 1st round pick of Penguins

Pittsburgh Penguins

I’ve been a fan of the Penguins since the early 90s, when I first started following the NHL. There was pretty much no chance for a young kid to watch NHL games in Finland in the early 90s, and the Internet was only just arriving, so it was mostly about video games and the few hockey magazines that were available, and the Pens’ super duo Lemieux – Jagr became my favourites. The new rise after picking Geno Malkin #2 in 2004 and Sidney Crosby #1 in 2005 has gone through the roof. This year the Pens didn’t have any really good spots to pick from, and they mostly looked for defencemen, with Joe Morrow at #23 and Scott Harrington at #54. #174 Josh Archibald was a good choice for a late-pick winger, and hopefully he’ll become part of the NHL roster.

Future

The 2011-2012 season starts in a few months, and hopefully it will be a little brighter for the Pens than the last one, when both Malkin and Crosby got injured and the Pens’ season ended a little too early.

Table hockey and statistics, part 1

Two hobbies combined

I’ve been playing table hockey for a little over 8 years now, and since the very beginning I’ve been interested in statistics and the different software we use when organising tournaments and seasons. I’ve written a few programs for tournament/season statistics in Java and Perl, so when I started learning R, table hockey statistics were the number one choice of data to learn with.

The Finnish Table Hockey Association organizes 7 tournaments per year as a ranking system, and to date there have been 125 tournaments with over 600 different players participating, which gave me a good starting point for my first R script.

The data is 125 txt files with bare names listed in order of placement. As I didn’t (and still don’t) know how to create data frames out of those, I wrote a small Java program to create CSV files for me so it was easy to move on to R.
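
For what it’s worth, here is a rough sketch of how the same files could probably be read straight into a data frame in R, assuming they are named 1.txt, 2.txt and so on, with one player name per line in order of placement. The function name read_placements and the long tournament/player/placement layout are my own assumptions, not something I have run against the real data.

```r
read_placements <- function(dir, n_tournaments) {
  per_file <- lapply(seq_len(n_tournaments), function(i) {
    # Each line is one player; the line number is the placement
    players <- readLines(file.path(dir, paste0(i, ".txt")))
    data.frame(
      tournament = rep(i, length(players)),
      player = players,
      placement = seq_along(players),
      stringsAsFactors = FALSE
    )
  })
  # Stack the per-tournament data frames into one long data frame
  do.call(rbind, per_file)
}

# Usage with two small made-up tournament files in a temporary directory
demo_dir <- tempfile("placements")
dir.create(demo_dir)
writeLines(c("Anna", "Bert"), file.path(demo_dir, "1.txt"))
writeLines(c("Bert", "Cara", "Anna"), file.path(demo_dir, "2.txt"))
placements <- read_placements(demo_dir, 2)
print(placements)
```

A long layout like this has one row per player per tournament, so a player who skipped a tournament simply has no row for it.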

Placements.java


/** Creates CSV files for each player on placement files
* @author rocknrblog.wordpress.com
* @version 0.9b
* License: This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
*/

import java.io.*;
import java.util.*;

public class Placements {

private static Scanner RD = new Scanner(System.in);
private static ArrayList<Player> players = new ArrayList<Player>();
private static ArrayList<Player> temporary = new ArrayList<Player>();

public static void main(String[] args) {

  System.out.println("Nro of tournaments:");
  int nroOfTournaments = RD.nextInt();

  for (int i = 1; i <= nroOfTournaments; i++) {
    createPlayersList(i + ".txt");
    System.out.println("Players read from file " + i + ".txt");
  }

  for (int i = 1; i <= nroOfTournaments; i++) {
    readPlayers(i + ".txt");
    System.out.println("Placements read from file " + i + ".txt");
  }

  writePlayers();
}

/**
* Reads the file given as argument and adds all new player names to an ArrayList
* @.pre filename != null
*/

public static void createPlayersList(String filename) {
  try {
    BufferedReader bufReader = new BufferedReader(
                                   new InputStreamReader(
                                   new FileInputStream(filename)));
    int i = 0;
    while(bufReader.ready()) {
      String line = bufReader.readLine();

      if (!isFound(line, players)) {
        players.add(new Player(line));
      }
      i++;
    }
    bufReader.close();
  }
  catch(Exception e) {
    System.out.println("Error: " + e);
  }
}

/**
* Reads players from placement lists to arraylist with placements
* @.pre filename != null
*/

private static void readPlayers(String filename) {
  for (int i = 1; i <= players.size(); i++) {

    Player p = players.get(i-1);
    int sijoitus = getPlacement(p.name(), filename);
    p.addPlace(sijoitus);
  }
}

/**
* Returns placement of wanted player on wanted file
* @.pre player != null & isFound(player, players) & fname!=null
*/

private static int getPlacement(String player, String fname) {
  int i = 0;
  try {
    BufferedReader bufReader = new BufferedReader(
                                   new InputStreamReader(
                                   new FileInputStream(fname)));
    i = 1;
    while(bufReader.ready()) {
      String line = bufReader.readLine();
      if (player.equals(line)) {
        return i;
      }
      i++;
    }
    bufReader.close();
    return 0;

  }

  catch(Exception e) {
    System.out.println("an error occurred: " + e);
  }
  return i;
}

/**
* Writes players' placement information into csv-file for R
* @.post a file named [player_name].csv is created
*/

private static void writePlayers() {
  try {
    for (int k = 0; k < players.size(); k++) {
      String filename = players.get(k).name() + ".csv";
      FileOutputStream printStream = new FileOutputStream(new File(filename));
      PrintWriter writer = new PrintWriter(printStream, true);
      writer.println("turnaus, sijoitus");
      writer.print(players.get(k));
      System.out.println(players.get(k).name() + " written");
      writer.close();
    }
  }
  catch (IOException e) {
    System.out.println(e);
  }
}

/**
* Looks if player is already found on the list
* @.pre name != null
*/

private static boolean isFound(String name, ArrayList<Player> list) {
  for (int j = 0; j < list.size(); j++) {
    if (list.get(j).name().equals(name))
      return true;
  }
  return false;
}

/**
* Gets player from the list
* @.pre name != null
*/

private static Player getPlayer(String name) {
  for (int j = 0; j < players.size(); j++) {
    if (players.get(j).name().equals(name))
      return players.get(j);
  }
  return null;
}
}

After this I wanted a program to combine multiple players into one CSV file for easy use in R:

CombineCSV.java

/** Creates a combined CSV file of chosen players
* @author rocknrblog.wordpress.com
* @version 0.9b
* License: This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
*/

import java.io.*;
import java.util.*;

public class CombineCSV {

private static ArrayList<String> players = new ArrayList<String>();
private static ArrayList<Integer> placements = new ArrayList<Integer>();
private static ArrayList<Integer> tournaments = new ArrayList<Integer>();

/**
* If run without command line parameters, it reads the players from the file 'players.txt'. If run with command line parameters, it uses those players instead. Either way it creates 'allPlayers.csv' with the combined data
*/
public static void main(String[] args) {
  if(args.length == 0) {
    readPlayers();
  }
  else {
    for (int i =0; i < args.length; i++) {
      players.add(args[i]);
    }
  }
  readOriginals();
  writeNewCSV();
}

/**
* Reads players from players.txt file
*/

private static void readPlayers() {
  try {
    String filename = "players.txt";
    BufferedReader bufReader = new BufferedReader(
                                   new InputStreamReader(
                                   new FileInputStream(filename)));

    while(bufReader.ready()) {
      players.add(bufReader.readLine());
    }
  }

  catch (Exception e) {
    System.out.print("readPlayers: " + e);
  }
}

/**
* Reads the player's original CSV file
*/
private static void readOriginals() {
  try {
    for (int i = 0; i < players.size(); i++) {
      String filename = players.get(i) + ".csv";
      BufferedReader bufReader = new BufferedReader(
                                 new InputStreamReader(
                                 new FileInputStream(filename)));
      int k = -1;
      while(bufReader.ready()) {
        k++;
        String line = bufReader.readLine();
        String[] data = line.split(",");
        if (k == 0) continue;
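        // Tournament numbers are only collected once, from the first player's file
        if (i == 0) {
          tournaments.add(Integer.parseInt(data[0]));
        }
        // Placements are collected for every player
        placements.add(Integer.parseInt(data[1]));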
      }
    }
  }

  catch(Exception e) {
    System.out.println("an error occurred: " + e);
  }
}

/**
* Writes a new CSV file with combined data
*/

private static void writeNewCSV() {
  try {
    String filename = "allPlayers.csv";
    FileOutputStream printStream = new FileOutputStream(new File(filename));
    PrintWriter writer = new PrintWriter(printStream, true);
    /* Creating a caption line */
    String caption = "tournament,";

    for (int i=1; i <= players.size(); i++) {
      System.out.println(players.get(i-1));
      caption += players.get(i-1);
      if (i < players.size()) caption += ",";
    }
    writer.println(caption);
    int s = placements.size() / players.size() ;

    for (int k = 1; k <= tournaments.size(); k++) {
      String line = k + ",";
      for (int p = k-1; p < placements.size(); p+= s) {
        if (placements.get(p) != 0) {
          line += placements.get(p) + ",";
        }
        else line += "NA,";
      }
      line = line.substring(0,line.length()-1);
      writer.print(line);
      if (k < tournaments.size()) { writer.print("\n"); }
    }
    writer.close();
  }

  catch (Exception e) {
    System.out.println("writeNewCSV: " + e);
  }
}
}

After this we get to the fun stuff, R:

## Creates plot of players' progress on tournaments
# Load ggplot2 for creating the plot and reshape2 for melt()
library("ggplot2")
library("reshape2")

# Read the data
tournamentdata <- read.csv('allPlayers.csv')

# Combine data according to column "tournament"
plotdata <- melt(tournamentdata, id="tournament")

# Theme the plot background to be white instead of default gray
theme_set(theme_bw() + theme(panel.background = element_rect(fill = "white", colour = NA)))

# Define plot title
mainTitle <- "Placements on FTHA Ranking tournaments"

# Create the plot with reversed y-axis (for the 1st place to be on the top) and two lines
# (one on 8th place for the playoffs and other on 16th place for the final group)
plot <- ggplot(plotdata, aes(x = tournament, y = value, colour = variable)) +
  geom_point() +
  geom_line() +
  scale_y_reverse("Placement") +
  geom_hline(yintercept = 8, colour = "red", size = 1.2) +
  geom_hline(yintercept = 16, colour = "blue", size = 1.2) +
  scale_x_continuous("Tournament", breaks = min(plotdata$tournament):max(plotdata$tournament)) +
  coord_cartesian(xlim = c(0, 127), ylim = c(0, 70)) +
  ggtitle(mainTitle)

# Print out the plot
print(plot)

Plot:

Progress of Nuttunen, Lampi, Ollila and Kantola

msg <- "Hello World!"; print(msg)

Introduction

My mom always told me to be polite, so I think I should introduce myself first. I’m a 20-something Finnish university student studying computer science (starting my 2nd year) and statistics (first year coming up), and I’m mostly interested in sports, statistics and programming, best when combined.

Why?

Because of my computing background I got interested in the R programming language, started studying it and decided to start a blog, inspired by the dozens of blogs on R-bloggers.com. This blog also works as a place for me to empty my mind and perhaps get new ideas when rethinking what I’ve done and what I should do with R in the future.

As a beginner I’ll take any advice and corrections offered, ’cause that’s the best way to learn to write better and more efficient code. My studying style of try, google, try, google more, succeed doesn’t always produce the best code, but it’s a way to get things done and running.

Most of my blog entries will be about R, but some will be about studying and other programming languages as well. I have some knowledge of Python, Perl and Java.

R-Bloggers.com

This blog is also on R-bloggers.com, which is an awesome blog service where you can find more interesting posts about R and statistics. It inspired me, so why don’t you give it a try?