visualising large amounts of hourly environmental data

It is Sunday, it's raining and I have a few hours to spend before I am invited for lunch at my parents place. Hence, I thought I'd use the time to produce another post. It has been a while since the last post as I have been in Africa for about two months for yet another stint of fieldwork on the great Kilimanjaro mountain…

This time I want to introduce the strip() function which is part of the METVURST suite for plotting meteorological/climatological data (available from https://github.com/tim-salabim/metvurst)

This function is intended to facilitate two things:

  1. to enable plotting of large (climatological) data sets at hourly resolution (the data need to be hourly observations) in a reasonably well defined space (meaning that you won't need pages and pages of paper to print the results).
  2. to display the data in a way that interpretation is possible from hourly to decadal time scales.

The function is much like the calendarHeat() function from Revolution Analytics (see here) though, as mentioned above, it produces plots of hourly data. In fact, calendarHeat() would be a good alternative to use when you intend to plot daily values.
strip() is implemented in lattice (needs latticeExtra to be more precise) and essentially plots a levelplot of hour of day (y-axis) vs. day of year (x-axis).

In detail the function takes the following parameters:

  • x (numeric): Object to be plotted (e.g. temperature).
  • date (character or POSIXct): Date(time) of the observations. Format must be 'YYYY-MM-DD hh:mm(:ss)'
  • fun: The function to be used for aggregation to hourly observations (if original is of higher frequency). Default is mean.
  • cond (factor): Conditioning variable e.g. years.
  • arrange (character): One of “wide” or “long”. Defaults to “long” which provides a layout that is easier to interpret, however, “wide” is better for use in presentation slides.
  • colour (character): a vector of color names. Defaults to rev(brewer.pal(11, “Spectral”))
  • …: Further arguments to be passed to levelplot (see ?lattice::levelplot for options). This can be quite handy to set the legend title using main = "YOUR TEXT HERE" (the legend is plotted on top of the actual plot).

Here's an example using hourly data from Fiji (Station is at Lautoka on the west coast). The data can be freely accessed through the Bureau of Meteorology in Australia (BOM).

First we need to download the data and do some reshaping

## LOAD RColorBrewer FOR COLOR CREATION
library(RColorBrewer)

## SET URL FOR DATA DOWNLOAD
url <- "http://www.bom.gov.au/ntc/IDO70004/IDO70004_"

## YEARS TO BE DOWNLOADED
yr <- 1993:2012

## READ DATA FOR ALL YEARS FROM URL INTO LIST
fijilst <- lapply(seq(yr), function(i) {
  read.csv(paste(url, yr[i], ".csv", sep = ""), na.strings = c(-9999, 999))
  })

## TURN LIST INTO COMPLETE DATAFRAME AND CONVERT NA STRINGS TO NAs
fiji <- do.call("rbind", fijilst)
fiji[fiji == -9999.00] <- NA
fiji[fiji == -9999.0] <- NA
fiji[fiji == 999.0] <- NA

## CREATE POSIX DATETIME AND CONVERT UTC TO LOCAL FIJI TIME
dts <- as.POSIXct(strptime(fiji$Date...UTC.Time, 
                  format = "%d-%b-%Y %H:%M")) + 12 * 60 * 60

## CREATE CONDITIONING VARIABLE (IN THIS CASE YEAR)
year <- substr(as.character(dts), 1, 4)

Now, let's plot the temperatures

## SOURCE FUNCTION (ALSO AVAILABLE AT https://github.com/tim-salabim/metvurst)
source("http://www.staff.uni-marburg.de/~appelhat/r_stuff/strip.R")

## PLOT STRIP FOR TEMPERATURE
strip(x = fiji$Air.Temperature, 
      date = dts,
      cond = year,
      arrange = "long",
      main = "Temperature")
## 
##  Module   :  strip 
##  Author   :  Tim Appelhans <tim.appelhans@gmail.com>, Thomas Nauss 
##  Version  :  2012-01-06 
##  License  :  GNU GPLv3, see http://www.gnu.org/licenses/

plot of chunk unnamed-chunk-2

We can clearly see both the diurnal and the seasonal signal.
Looking at other parameters such as wind direction and speed should also give an interesting insight into the seasonal and diurnal dynamics of the location.

I leave it up to you to explore this further…

I am off to a nice sunday roast now! yumyum

sessionInfo()
## R version 2.15.3 (2013-03-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=C                 LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] grid      parallel  stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] reshape_0.8.4       plyr_1.8            latticeExtra_0.6-19
##  [4] lattice_0.20-13     RColorBrewer_1.0-5  RWordPress_0.2-3   
##  [7] rgdal_0.8-5         raster_2.0-41       sp_1.0-5           
## [10] knitr_1.1          
## 
## loaded via a namespace (and not attached):
## [1] digest_0.6.3   evaluate_0.4.3 formatR_0.7    markdown_0.5.4
## [5] RCurl_1.95-3   stringr_0.6.2  tools_2.15.3   XML_3.95-0    
## [9] XMLRPC_0.2-5
About these ads
This entry was posted in climatology, R, visualisation. Bookmark the permalink.

10 Responses to visualising large amounts of hourly environmental data

  1. Thanks for sharing your ideas and code.
    Perhaps it would be very useful to know what is already available in the package “OpenAir”.
    I think it covers a very similar functionality like the one of this post.

    The project’s site is here:
    http://www.openair-project.org/

    And it is also very worth to read their very well designed manual:
    http://www.openair-project.org/Downloads/OpenAirManual.aspx

  2. tonycaine says:

    in post looks like you call strip() with

    dts <- as.POSIXct(strptime(fiji$Date…UTC.Time,
    format = "%d-%b-%Y %H:%M")) + 12 * 60 * 60)

    strip(x = fiji$Air.Temperature,
    date = dts,
    cond = year,
    arrange = "long",
    main = "Temperature")

    this has an error — get
    Error in as.POSIXlt.character(x, tz, …) :
    character string is not in a standard unambiguous format

    however the source to strip () says date format is
    Format must be 'YYYY-MM-DD hh:mm(:ss)'

    and following call works;

    strip(x = fiji$Air.Temperature,
    date = as.character(dts),
    cond = year,
    arrange = "long",
    main = "Temperature")

    And GREAT chart.
    Going to get local weather (rather than Fiji) and have a play.

    be well
    Tony

  3. ABC says:

    AMAZING

  4. Alexander says:

    Hi, I like very much this package, but I have some questions:

    1. How can I plot an annual mean plot?
    For example I would like to know if is possible to plot the mean of all years in one contour plot, because to present my results I would like to have only a plot with annual mean values and not year per year.

    2. How can I plot more than two plots per page?
    It is because I have data from two different climate stations, and I would like for example to plot the two annual means precipitations in one page but sharing the same x-scale and palette.

    Thanks,
    Alexander

    • Tim Salabim says:

      Hi Alexander,
      1. I am not quite sure if I understand your question correctly. An annual mean is usually only one number and for that you wouldn’t need a plot. In case you wanted to plot an average year on an hourly basis, i.e. the average for each hour of day during each day of the year, then you would need to aggregate your data accordingly. If, for example, your datetime variable looks like this 2000-01-01 00:00, 2000-01-01 01:00, 2000-01-01 02:00, … then you could create a vector where you delete the first 4/5 digits (the year) of your datetime variable so that you are left with something like this 01-01 00:00, 01-01 01:00, 01-01 02:00, … which you could then use to aggregate your data. The aggregated data can then be plotted using the strip function.

      2. If you want to plot 2 stations you could simply use the station identifier as teh conditioning variable (instead of year in the above example).

      I hope this clarifies,

      Tim

  5. Alexander says:

    Hi Tim,

    Thanks for your reply, in the first question you are right, I wanted tell you an Average year on an hourly basis plot.

    But for the second question what I mean, if is possible to plot multiples graphs together like this?:
    http://bosquesplus.com/varios/test_Hum-Tem.JPG

    I made this figure with photoshop but I would like to know is is possible to do with R?

    and finally, can you tell me please how can I export this graphs in High Resolution for a paper.

    Best regards,
    Alexander

    • Tim Salabim says:

      Alexander,
      yes this is possible in R. First of all you would need a data frame that has a station ID (in your case ‘valley’ and ‘peak’) column. Then you can use this column as conditioning variable (cond = stationID).
      If you then produce one plot object for temperature (eg. plot.temp) and one for humidity (eg. plot.hum) you could then use the grid package to set up the plotting area accordingly. The code for doing this (including a hi-res png output) would be something like this:

      
      ### set up and open graphics device
      png("my_plot.png", width = 1024 * 3, height = 748 * 3, res = 300)
      
      ### clear plot area
      grid.newpage()
      
      ### define first plotting region (viewport)
      vp1 <- viewport(x = 0, y = 1, 
                      height = 0.5, width = 1,
                      just = c("left", "top"),
                      name = "top")
      
      ### enter vp1 
      pushViewport(vp1)
      
      ### plot a plot - needs to be printed (and newpage set to FALSE)!!!
      print(plot.hum, newpage = FALSE)
      
      ### leave vp1 - up one level (into root vieport)
      upViewport(1)
      
      ### define second plot area
      vp2 <- viewport(x = 0, y = 0, 
                      height = 0.5, width = 1,
                      just = c("left", "bottom"),
                      name = "bottom")
      
      ### enter vp2
      pushViewport(vp2)
      
      ### plot another plot
      print(plot.temp, newpage = FALSE)
      
      ### leave vp2
      upViewport(1)
      
      ### close graphics device
      dev.off()
      
      

      For more details on how to plot multiple plots on one page see section 3 in this tutorial
      http://teachpress.environmentalinformatics-marburg.de/2013/07/creating-publication-quality-graphs-in-r-7/

      Hope that helps,

      Tim

  6. Alexander says:

    Hi Tim, thanks for your explanation.

    I was trying it, and I saw that I need to create an Lattice/ggplot2 Object to print my plots. Now my question is how can I create an object of my plot?

    I tried to create an object, to use it here:
    print(plot.hum, newpage = FALSE)

    but it doesn’t works, because I create it like this:

    ###### object plot.hum
    plot.hum upViewport(1)
    Error in UseMethod(“depth”) :
    no applicable method for ‘depth’ applied to an object of class “NULL”
    >

    Can you help me with some direction to do this?

    My best regards,
    Alexander

    • Tim Salabim says:

      Hey Alexander,
      I need to incorporate some changes into the package for this to work. I will post a solution to your question here in a few days.

      Cheers
      Tim

  7. Pingback: Update to metvurst | metvurst

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s