Thursday, December 27, 2012

URL parameter specification (effective v13)

Last updated: 11/20/2013


To tell which genome assembly to use
parameter name: genome
value: one of hg19, mm9, dm3, tair10, danRer7, spombe201203, AGPv2
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19
note: when this parameter is used alone, it loads a blank Browser (with no tracks displayed)
this parameter must accompany every other parameter described below


To retrieve a saved session
parameter name: session
value: the session ID
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&session=your_session_here
note: when this parameter is used, all of the following parameters are ignored


To show data over a specific genomic location
parameter name: coordinate
value: coordinate must be in the form of "chr1:123-456"
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&coordinate=chr7:26663835-28123541
note: this parameter cannot be used when parameter "geneset" is present


To show data over a gene set (or set of genomic intervals)
parameter name: geneset
value: a list of gene symbols or coordinates, all joined by comma
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&geneset=chr7:26675831-28111542,chr7:101818721-102732764,cyp2c19
note: this parameter cannot be used when parameter "coordinate" is present
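
The rule that "coordinate" and "geneset" exclude each other can be checked before a link is shared. The following is a minimal sketch, not part of the Browser code: the function name `view_url` and the choice to raise an error are assumptions made for illustration, and only the parameter names and the base URL come from the specification above.

```python
from urllib.parse import quote

# Base URL as used in the examples of this specification.
BROWSER_URL = "http://epigenomegateway.wustl.edu/browser/"


def view_url(genome, coordinate=None, geneset=None):
    """Build a Browser URL for one region or one gene set (never both).

    Hypothetical helper: the name and signature are illustrative only.
    """
    if coordinate is not None and geneset is not None:
        raise ValueError('"coordinate" and "geneset" cannot be used together')
    # "genome" accompanies every other parameter.
    params = [("genome", genome)]
    if coordinate is not None:
        params.append(("coordinate", coordinate))
    if geneset is not None:
        # Gene symbols and coordinates are joined by comma.
        params.append(("geneset", ",".join(geneset)))
    # Keep ":" and "," readable, as in the examples above.
    query = "&".join(f"{key}={quote(value, safe=':,-')}" for key, value in params)
    return f"{BROWSER_URL}?{query}"


print(view_url("hg19", coordinate="chr7:26663835-28123541"))
```

Passing both a coordinate and a gene set raises `ValueError` instead of producing a URL that breaks the rule above.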


To run genomic juxtaposition on a BED track
parameter name: juxtapose
value: if the bed track is native, use the internal track name (e.g. "refGene")
if it is a custom bed track, use the track URL; a second parameter "juxtaposecustom=on" must also be supplied
example: native: http://epigenomegateway.wustl.edu/browser/?genome=hg19&juxtapose=refGene
custom: http://epigenomegateway.wustl.edu/browser/?genome=hg19&juxtapose=http://vizhub.wustl.edu/hubSample/hg19/bed.gz&juxtaposecustom=on
note: (none)


To display a tabular datahub (deprecated)
parameter name: datahub
value: URL of the hub descriptor file
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&datahub=http://vizhub.wustl.edu/hubSample/hg19/hub2.txt
note: only one hub URL can be used


To display a JSON datahub
parameter name: datahub_jsonfile
value: URL to the JSON file
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&datahub_jsonfile=http://vizhub.wustl.edu/hubSample/hg19/hub.json
note: only one hub URL can be used
Refer to this post about how to prepare your JSON datahub.


To add native heatmap tracks (bedgraph/bigwig/categorical)
parameter name: hmtk
value: track names joined by comma
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&hmtk=GSM469970,GSM521901,GSM521895,GSM521897,GSM469974,GSM521913,GSM469968,GSM521889,GSM945297_2,GSM608165_1,GSM733692_1,GSM788085_1,GSM733776_1,GSM607494_1,GSM945228_2,GSM945228_1
note: use the internal track name, not the label printed on the left of the track image in the browser


To add native genomic annotation tracks
parameter name: gftk
value: trackname1,mode1,trackname2,mode2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&gftk=refGene,full,gc5Base,show
note: currently, native long-range genome interaction tracks are also declared with this parameter
the mode can never be "hide"
if the track is quantitative, the mode can only be "show"
if the track is positional (bed), the mode can be "thin", "full", or "density"
if the track is long-range interaction, the mode can be "arc", "trihm", "thin", "full", or "density"
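
The mode rules for "gftk" lend themselves to a small lookup table. The sketch below is illustrative only: the kind labels ("quantitative", "positional", "longrange") and the function name `gftk_value` are assumptions, and the caller is expected to already know which kind each native track is.

```python
# Allowed display modes per track kind, copied from the note above.
# The kind labels themselves are hypothetical names used only in this sketch.
ALLOWED_MODES = {
    "quantitative": {"show"},
    "positional": {"thin", "full", "density"},
    "longrange": {"arc", "trihm", "thin", "full", "density"},
}


def gftk_value(tracks):
    """Return the value of the "gftk" parameter.

    tracks: list of (track_name, track_kind, mode) tuples.
    """
    parts = []
    for name, kind, mode in tracks:
        if mode not in ALLOWED_MODES[kind]:
            # This also rejects "hide", which is never allowed.
            raise ValueError(f'mode "{mode}" is not allowed for {kind} track {name}')
        parts.extend([name, mode])
    return ",".join(parts)


print(gftk_value([("refGene", "positional", "full"),
                  ("gc5Base", "quantitative", "show")]))
```

The printed value matches the value used in the example URL above.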


To add native metadata term
parameter name: metadata
value: metadata terms joined by comma; spaces in term names must be encoded as %20
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&metadata=Histone%20Mark,12003
note: non-leaf terms are shown as "words"
leaf terms must be given by their internal ID
to create custom metadata terms, use a JSON datahub


To add custom bedGraph track
parameter name: custombedgraph
value: name1,url1,name2,url2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&custombedgraph=track%20No.1,http://vizhub.wustl.edu/hubSample/hg19/qual1.gz,track%20No.2,http://vizhub.wustl.edu/hubSample/hg19/qual2.gz
note: special characters in the names must be escaped


To add custom bigWig track
parameter name: custombigwig
value: name1,url1,name2,url2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&custombigwig=track%20No.1,http://vizhub.wustl.edu/hubSample/hg19/sample.bigWig,track%20No.2,http://vizhub.wustl.edu/hubSample/hg19/GSM429321.bigWig
note: special characters in the names must be escaped


To add custom bed track (positional annotations)
parameter name: custombed
value: name1,url1,mode1,name2,url2,mode2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&custombed=bedTrack%20No.1,http://vizhub.wustl.edu/hubSample/hg19/bed.gz,full,bedTrack%20No.2,http://vizhub.wustl.edu/hubSample/hg19/mer41b.gz,full
note: mode must be one of "thin", "full", "density"
special characters in the names must be escaped
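
Escaping track names by hand is error-prone, so here is a minimal sketch of how the "custombed" value could be assembled. It assumes that percent-encoding (as in "bedTrack%20No.1") is the intended escaping and that track URLs are passed through unchanged, as they are in the example; the function name `custombed_value` is hypothetical.

```python
from urllib.parse import quote

# Modes accepted for custom bed tracks, from the note above.
BED_MODES = {"thin", "full", "density"}


def custombed_value(tracks):
    """Return the value of the "custombed" parameter.

    tracks: list of (name, url, mode) tuples.
    """
    parts = []
    for name, url, mode in tracks:
        if mode not in BED_MODES:
            raise ValueError(f'mode must be one of {sorted(BED_MODES)}, got "{mode}"')
        # Percent-encode the name so spaces and commas cannot break the list.
        # Assumption: the URL is left as-is, matching the example above.
        parts.extend([quote(name, safe=""), url, mode])
    return ",".join(parts)


print(custombed_value([
    ("bedTrack No.1", "http://vizhub.wustl.edu/hubSample/hg19/bed.gz", "full"),
]))
```

The same pattern applies to the other custom track parameters; "custombedgraph" and "custombigwig" take name,url pairs without a mode.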


To add custom long-range interaction track
parameter name: customlongrange
value: name1,url1,mode1,name2,url2,mode2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&customlongrange=trackname,http://vizhub.wustl.edu/hubSample/hg19/K562POL2.gz,arc
note: mode can be one of "arc", "trihm", "thin", "full", "density"
special characters in the names must be escaped


To add custom read-alignment (BAM) track
parameter name: custombam
value: name1,url1,mode1,name2,url2,mode2,...
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&custombam=tempest,http://vizhub.wustl.edu/hubSample/hg19/bam1.bam,density
note: mode can be one of "thin", "full", "density"
special characters in the names must be escaped


To show secondary panel (experimental)
parameter name: splinters
value: coordinate (e.g. chr5:5000000-5100000)
example: http://epigenomegateway.wustl.edu/browser/?genome=hg19&gftk=gc5Base,show&splinters=chr5:5000000-5100000
note: multiple coordinates can be added; join them with commas. Only use this if your screen is wide enough...


To show default tracks and contents for a genome assembly

parameter name: defaultContent
value: string "on"
example: http://epigenomegateway.wustl.edu/browser/?genome=hg18&defaultContent=on
note: (none)



v13: enhanced user interface, architectural overhaul

Happy Holidays to you all! We're very happy to announce a brand new WashU Epigenome Browser. Our goal is to make the User Interface rich in functionality, while keeping it as simple as possible.

Get the code here: from our server, or dropbox
The v13 .tgz file does not include the folders "images" and "manual"; both are available in the .tgz files of past versions.


To use the URL parameter function, see instructions in this new post.

To use the datahub function, see instructions in this updated post.


Docs that are not available for the moment:
  1. new interface of scaffold sequence management, Gene Set View (management)
  2. simplified interface of Genome Snapshot (aka Bird's Eye View)
  3. new function: split panel display (for browsing long-range interaction data sets)
  4. improved Circlet View interface



The following screenshot shows the part of the Browser UI that you will see most often.



The old Browser is still available at http://epgg-test.wustl.edu/browser/ in case you need to access your saved sessions. We will keep it alive for two months and alert you before we take it down.



Following is a quick overview of some new features available on the new Browser:


Toolbox as the "single entry point"

Please find the row of three buttons in the middle of toolbox: "Tracks", "CustomTK", and "Apps". This is the entry point to a majority of the Browser's functions.

Tracks refer to the collection of tracks that are available through the Browser service. Clicking on it shows a small menu which leads to the selection panel for each category of tracks:


CustomTK links to the panel for custom track submission and management. The number in parentheses is the number of custom tracks you've submitted.

Apps links to the list of applications.



Genome navigation

On top of Browser panel a blue button shows the current coordinate. Click this button to show options to relocate:


Or as indicated, click the button in this small panel to visually select a new region from one of the chromosomes:




Agile track handling
The track labels are always shown on the left of the track images. Mouse over a label to see the complete track name and more info (e.g. the color scale bar for quantitative heatmap tracks).


Press your mouse on the track header and drag it up or down, and the track will be rearranged accordingly. You can even drag a heatmap track outside of the genome heatmap, or bring a track into the genome heatmap from outside. You're in full control now:




Handy sticky notes
A small but perhaps very useful new feature is the coordinate-anchored sticky note. It's incredibly easy to use: just right-click on the chromosome ideogram below the genome heatmap, and an input panel appears:


Once a note is made, it is attached to a point in the genome and is indicated by a "note" icon. You can save notes in a Session so they persist. Give it a try!







A few things are still missing here, but we're working to bring these functions back as soon as possible:
  • Pairwise comparison and hypothesis test over quantitative track data
  • Querying and retrieval of KEGG pathway
  • SVG output for Genome Snapshot and Circlet view (previously known as Henge View)

Wednesday, November 7, 2012

v12 minor release: S. pombe genome, bug fixes

Get source code from dropbox folder.


S. pombe genome


The version 12 code release supports the genome of Schizosaccharomyces pombe, the fission yeast. It is the first fungal species supported by the WashU Genome Browser.


The S. pombe genome assembly was released in March 2012 and was downloaded from PomBase. You're welcome to suggest additional annotation or public experimental assay data sets to be added to this genome.

Procedures to prepare the S. pombe genome database can be viewed here.


Bug fix
A major bug that prevented a session from being restored when it involved juxtaposition on a custom bed track is now fixed.

An error that prevented URL parameters from being parsed was fixed. You can now append session information to a URL in the format of:

http://epigenomegateway.wustl.edu/browser/?genome=[ASSEMBLY]&session=[SESSION]

Here [ASSEMBLY] is the name of the genome assembly (e.g. hg19), and [SESSION] is the session ID. By composing such a URL you can easily share saved sessions with others.

Additionally, you can append status ID to the URL: &statusid=[STATID]



Tuesday, September 25, 2012

v11: search track by GEO accession, and bug fixes

Get source code from dropbox folder.

The version 11 code release provides a new feature that allows you to search for tracks with a list of GEO accession numbers, plus a few bug fixes.


Find track by GEO accession
Follow these steps to see this new feature in action.

Click this link to open up the Browser with human genome hg19 but no data:

http://epigenomegateway.wustl.edu/browser/?genome=hg19

At navigation bar click tab "Genome heatmap":


Inside the panel, click tab "Find by GEO":


Enter GEO accessions. You can use a list of them here, one accession per line:


GSM469970
GSM521901
GSM521895
GSM521897
GSM521909
GSM521913
GSM469968
GSM521889
GSM608165
GSM945297
GSM788085
GSM733692
GSM607494
GSM733776
GSM945228

This function only accepts Sample accessions (names start with GSM), but not Series (names start with GSE), or Platforms (names start with GPL).
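
If your accession list comes from a mixed source, it can be cleaned before pasting it into the search box. This is a minimal sketch that assumes a Sample accession is "GSM" followed by digits only; the function name `sample_accessions` is hypothetical and is not part of the Browser.

```python
import re

# Assumption: a GEO Sample accession is "GSM" followed by one or more digits.
GSM_PATTERN = re.compile(r"GSM\d+")


def sample_accessions(text):
    """Keep only Sample (GSM) accessions from text with one accession per line."""
    kept = []
    for line in text.splitlines():
        accession = line.strip()
        # Series (GSE) and Platform (GPL) accessions, and blank lines, are dropped.
        if GSM_PATTERN.fullmatch(accession):
            kept.append(accession)
    return kept


print("\n".join(sample_accessions("GSM469970\nGSE12345\nGPL570\nGSM521901")))
```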

Press the "Search" button, and a short moment later the search results appear:


Tracks are listed in the table on the right. You can press the button "Display all" to have all of them displayed.



Otherwise, you can selectively display a few by clicking on the track labels, and the selected tracks will be added to a table in the toolbox panel. After you finish selecting, press the button there to add them.

Thanks to Rebecca Lowdon for suggesting this feature!


Bug fix
A bug associated with displaying bigwig tracks as "wreath tracks" for the Henge View is corrected.

A bug affecting the parsing of mismatching base pairs in the SAM track is fixed.

Sunday, September 23, 2012

v10 minor update: bug fix and more scrolling options

Download source code from dropbox.

This code release fixes a bug associated with SAM track display. When read alignment data was displayed in genomic juxtaposition mode, the track image appeared shifted; this is now fixed.

Besides, you can now drag on any of the genome annotation tracks to scroll.

Try this link to open the browser and show the tracks as displayed in the screen shot below:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&juxtapose=LTR&gftk=LTR,full&coordinate=chr1:1340000-1390000&custombam=stat1hela,http://vizhub.wustl.edu/hubSample/hg19/sam1.gz,thin


As illustrated in this example, genomic juxtaposition focuses the view on LTR elements and reveals that STAT1 ChIP-Seq reads pile up over one LTR copy, suggesting that this LTR copy plays some role in STAT1 activity in HeLa cells.

STAT1 ChIP-Seq data comes from this publication: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing

Sunday, September 16, 2012

v9 major update: continued bigWig support, new custom track UI, bug fix

Access source code from our public Dropbox folder.


bigWig format continues to be supported

In this code release we continue to support the bigWig file format. Users now have two options for visualizing quantitative data: bedGraph files compressed with bgzip and indexed with tabix, or bigWig files.

The main advantage of bigWig over bedGraph files is that bigWig performs much faster at high-level, low-resolution browsing, that is, when you look at data over very large genomic intervals or whole chromosomes, as in Bird's Eye View.

But when browsing at finer scale, we don't observe any performance difference between the two data formats.

However, sessions saved before this change are no longer valid. We're sorry for the inconvenience.


New user interface for custom track function

Click a tab belonging to one type of custom track to see the submission panel.


Showing the panel for bedGraph tracks. Click "GO BACK" on top to slide back.







Bug fix


A bug associated with Gene Plot is fixed. Now it correctly handles genes with only 1 exon.

A bug with getting chromosome sequence during Gene Set View is fixed. When the Browser is running Gene Set View at very fine zoom level, the chromosome sequence can be correctly shown.

Saturday, September 15, 2012

Prepare custom long-range interaction track

A sample script for converting certain UCSC ChIA-Pet track files into WashU Browser track format is now available at http://epigenomegateway.wustl.edu/browser/script/, with name "makeTrack_from_ucscChiapet.py".

To use this script, first download a ChIA-Pet track file from UCSC/ENCODE public file directory: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGisChiaPet/

Here we use "wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed.gz" as input for the script. Only files ending with ".bed.gz" or in similar format can be processed by this script.

Make sure you have bedSort, bgzip and tabix programs installed on your computer. On a linux computer, run these commands:

gunzip wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed.gz

python makeTrack_from_ucscChiapet.py wgEncodeGisChiaPetHct116Pol2InteractionsRep1.bed abcd

After these two steps, 2 files will be generated: "abcd.gz", and "abcd.gz.tbi". Follow step 4 below to display this track via the custom track mechanism.

You will likely need to make small modifications to this script before it can process data in a different format.




0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Make a text file for your long-range interaction data with the following columns:
  1. chromosome name
  2. start coordinate
  3. stop coordinate
  4. information about the interacting region (e.g. chrX:123-456,3.14, where "chrX:123-456" is the coordinate of the mate, and "3.14" is the score of the interaction)
  5. ID (unique non-negative integer)
  6. relative direction of the interacting region
Be sure to make TWO records for a pair of interacting loci, one record for each locus.


As an example, if interval "chr1:111-222" interacts with interval "chr2:333-444" with a score of 55, we use the following two lines to represent this interaction:

chr1   \t   111   \t   222   \t   chr2:333-444,55   \t   1   \t   .
chr2   \t   333   \t   444   \t   chr1:111-222,55   \t   2   \t   .


2 Compress the text file:
$ bgzip interaction.txt

The old file is gone and a new file "interaction.txt.gz" is there instead.

3 Build tabix index of the compressed file:
$ tabix -p bed interaction.txt.gz

The "interaction.txt.gz" is untouched but an index file "interaction.txt.gz.tbi" is generated.

4 Display this file as a custom long-range interaction track on WashU Genome Browser.
Place both files (".gz" and ".gz.tbi") in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.
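
Step 1 above can be sketched in a few lines of code. This is an illustrative helper, not the sample script mentioned earlier: the function name `interaction_records` is hypothetical, IDs are simply numbered from 1, and the direction column is filled with "." as in the example lines. The rows are sorted by chromosome and start because tabix can only index position-sorted files.

```python
def interaction_records(pairs):
    """Turn interacting pairs into tab-separated track lines.

    pairs: iterable of ((chrom, start, stop), (chrom, start, stop), score).
    Each pair yields TWO records, one per locus, as the format requires.
    """
    rows = []
    next_id = 1  # unique non-negative integer ID for every record
    for left, right, score in pairs:
        for locus, mate in ((left, right), (right, left)):
            chrom, start, stop = locus
            # Column 4: coordinate of the mate, then the interaction score.
            mate_text = f"{mate[0]}:{mate[1]}-{mate[2]},{score}"
            # Column 6 uses "." as in the example above (a placeholder here).
            rows.append((chrom, start, stop, mate_text, next_id, "."))
            next_id += 1
    # tabix needs records sorted by chromosome, then by start coordinate.
    rows.sort(key=lambda row: (row[0], row[1]))
    return ["\t".join(str(field) for field in row) for row in rows]


for line in interaction_records([(("chr1", 111, 222), ("chr2", 333, 444), 55)]):
    print(line)
```

Writing the returned lines to "interaction.txt" gives a file ready for steps 2 and 3.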

Sunday, September 9, 2012

v8 major update - tabix indexing, chromHMM tracks

The WashU Genome Browser has been on a fast track of change. Today we announce yet another major update.


tabix for file indexing/querying
The Browser is now using tabix to store, index and query the track files (http://samtools.sourceforge.net/tabix.shtml).

Tabix is a peer of UCSC's bigWig/bigBed system, except it is much more generic and simpler.

Users using the custom tracks need to migrate their data. Please refer to following posts on how to convert UCSC formats to tabix format:

bigWig to tabix
bigBed to tabix
BAM to tabix




chromatin state tracks (categorical data)

Broad chromatin state data (http://compbio.mit.edu/ChromHMM/) on 9 human cell lines are now displayed. Data is obtained from ENCODE project. To see all of them, click the link below:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&hmtk=wgEncodeBroadHmmGm12878HMM,wgEncodeBroadHmmH1hescHMM,wgEncodeBroadHmmHepg2HMM,wgEncodeBroadHmmHmecHMM,wgEncodeBroadHmmHsmmHMM,wgEncodeBroadHmmHuvecHMM,wgEncodeBroadHmmK562HMM,wgEncodeBroadHmmNhekHMM,wgEncodeBroadHmmNhlfHMM&metadata=Sample


The chromHMM tracks are built on a new type of track -- the track with categorical data. Tracks of this type are displayed in the genome heatmap alongside the genome-wide quantitative assay results, but show data of a categorical nature, e.g. different chromatin states.


When you invoke the configuration options on the chromatin state tracks, you will see quite different options compared with quantitative tracks. That is, the complete list of "states" or "categories", and controls to change the color of each state.

This new feature is still under development. We're working to add custom track support for this new track type.

Generate tabix file from SAM/BAM file

 Update 6/1/2013: SAM format is no longer supported, please use BAM format files instead.

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if you already have a SAM file.

Convert the BAM file to SAM file using samtools:
$ samtools view input.bam > input.sam

2 Compress the SAM file:
$ bgzip input.sam

The old file is gone and a new file "input.sam.gz" is there instead.

3 Build tabix index of the compressed SAM file:
$ tabix -p sam input.sam.gz

The "input.sam.gz" is untouched but an index file "input.sam.gz.tbi" is generated.

4 Display this file as a custom SAM track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.

Prepare custom track of annotation data (or "bed" track)

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if your file is BED format.

Run the command bigBedToBed in UCSC genome browser tool set and convert the bigBed file to a bed text file.

2 Compress the BED file:
$ bgzip input.bed

The old file is gone and a new file "input.bed.gz" is there instead.

3 Build tabix index of the compressed BED file:
$ tabix -p bed input.bed.gz

The "input.bed.gz" is untouched but an index file "input.bed.gz.tbi" is generated.

4 Display this file as a custom bed track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.
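
Steps 2 and 3 can also be driven from a script when many files need converting. The sketch below only wraps the two commands shown above; the function names are hypothetical, and it assumes bgzip and tabix are installed and on your PATH.

```python
import shutil
import subprocess


def tabix_commands(text_file, preset="bed"):
    """Return the two commands from steps 2 and 3 as argument lists."""
    return [
        ["bgzip", text_file],                        # replaces the file with <file>.gz
        ["tabix", "-p", preset, text_file + ".gz"],  # writes <file>.gz.tbi
    ]


def compress_and_index(text_file, preset="bed"):
    """Run bgzip and tabix; return the names of the two files to upload."""
    for tool in ("bgzip", "tabix"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} was not found on PATH")
    for command in tabix_commands(text_file, preset):
        subprocess.run(command, check=True)  # stop on the first failure
    return text_file + ".gz", text_file + ".gz.tbi"


print(tabix_commands("input.bed"))
```

Calling `compress_and_index("input.bed")` would run both commands and return the two file names that belong in the same directory on your web server.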

The BED format used by WashU Epigenome Browser:
  1. chromosome name
  2. start coordinate
  3. stop coordinate
  4. Name (if absent, use dot)
  5. ID (unique non-negative integer)
  6. Strand (+/-/.)
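
Because this six-column layout differs from a generic BED file, it can help to check a file before compressing it. The following is a minimal sketch with hypothetical function names; the check that the stop coordinate is not smaller than the start is an added assumption, not something the format description above states.

```python
def parse_bed_line(line):
    """Parse one tab-separated line of the six-column BED format above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, found {len(fields)}")
    chrom, start, stop, name, item_id, strand = fields
    start, stop, item_id = int(start), int(stop), int(item_id)
    if start < 0 or stop < start:  # assumption: stop must not precede start
        raise ValueError("bad coordinates")
    if item_id < 0:
        raise ValueError("ID must be a non-negative integer")
    if strand not in {"+", "-", "."}:
        raise ValueError("strand must be +, - or .")
    return chrom, start, stop, name, item_id, strand


def parse_bed(lines):
    """Parse all lines and make sure every ID is unique."""
    seen_ids = set()
    rows = []
    for line in lines:
        row = parse_bed_line(line)
        if row[4] in seen_ids:
            raise ValueError(f"duplicate ID {row[4]}")
        seen_ids.add(row[4])
        rows.append(row)
    return rows


print(parse_bed(["chr1\t100\t200\t.\t1\t+", "chr1\t300\t400\tgeneA\t2\t-"]))
```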

Prepare custom track of numerical data

0 Make sure you have the tabix program installed.
You can download the latest source and compile:
http://sourceforge.net/projects/samtools/files/tabix/

Or if you're using Ubuntu operating system, install it using apt-get:
$ apt-get install tabix
You should have both tabix and bgzip programs available on your computer.

1 Skip this step if your file is bedGraph format.

For bigWig files...
Run the command bigWigToBedGraph from the UCSC genome browser tool set to convert the bigWig file to a bedgraph text file.

For wiggle files...
Convert them into bedgraph text files. Only a few lines of code are needed for this task.
Or if you really hate coding, you can convert the wiggle file to bigWig format using wigToBigWig (also from the UCSC genome browser tool set), then run bigWigToBedGraph.

2 Compress the bedgraph text file:
$ bgzip input.bedgraph

The old file is gone and a new file "input.bedgraph.gz" is there instead.

3 Build tabix index of the compressed bedgraph file:
$ tabix -p bed input.bedgraph.gz

The "input.bedgraph.gz" is untouched but an index file "input.bedgraph.gz.tbi" is generated.

4 Display this file as a custom bedgraph track on WashU Genome Browser.
Put the .gz and .gz.tbi files in the SAME directory on your web server.
Use only the URL to the .gz file to make the custom track.
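
For the wiggle case in step 1, the "few lines of code" might look like the sketch below. It is one possible approach, not the Browser's own converter: the function name is hypothetical, it handles only "fixedStep" and "variableStep" sections, and it skips "track", "browser" and comment lines. Wiggle positions are 1-based while bedGraph intervals are 0-based and half-open, hence the subtraction of 1.

```python
def wiggle_to_bedgraph(lines):
    """Convert wiggle text lines into bedGraph text lines."""
    out = []
    mode = chrom = None
    span = step = 1
    position = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith(("fixedStep", "variableStep")):
            # Declaration line: key=value settings follow the keyword.
            words = line.split()
            mode = words[0]
            settings = dict(word.split("=", 1) for word in words[1:])
            chrom = settings["chrom"]
            span = int(settings.get("span", 1))
            if mode == "fixedStep":
                position = int(settings["start"])
                step = int(settings.get("step", 1))
            continue
        if mode is None:
            raise ValueError("data line found before any declaration line")
        if mode == "fixedStep":
            value = line
            start = position - 1      # 1-based to 0-based
            position += step
        else:
            position_text, value = line.split()
            start = int(position_text) - 1
        out.append(f"{chrom}\t{start}\t{start + span}\t{value}")
    return out


print("\n".join(wiggle_to_bedgraph(
    ["fixedStep chrom=chr1 start=101 step=10 span=5", "1.5", "2.5"])))
```

The output lines can be written to "input.bedgraph" and then compressed and indexed as in steps 2 and 3.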

Monday, August 20, 2012

v7 major update - Long Range interaction view, SVG output

It's been quite a while since our last post in June. We've been busy preparing some big updates on our dear Epigenome Browser, and this morning we're very happy to deploy them on our public server so all of our dear users could benefit.

Long range genome interaction data visualization
The "long range genome interaction" data sets, as those generated by 5C, Hi-C, and ChIA-PET, capture 3-dimensional chromatin arrangement in cell nucleus. This type of data is very promising in improving our interpretation of high-throughput genomics assay results (e.g. TF ChIP-Seq, histone mark ChIP-Seq), and more importantly understanding how eukaryotic genomes function as nonlinear systems.

Now you can explore the long-range interaction data sets using our Browser! Just click the following link to see it in action:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&coordinate=chr7:26663835-28123541&gftk=IMR90_40kb_hindIII_combined,trihm,wgEncodeGisChiaPetK562CtcfInteractionsRep1,arc,refGene,full&hmtkmetadata=Histone%20Mark,11310,14018

This humongous URL yields a view with two long range interaction tracks and a set of histone mark tracks. With a few twists you can achieve a view like the following screen shot:



The triangular heatmap shows data from a Hi-C assay on IMR90 cells (Dixon JR, Bing Ren, Nature 485(7398):376-80).

The arcs show data from a ChIA-PET assay on K562 cells (ENCODE Project data).

The histone mark tracks show active and repressive histone marks on both IMR90 and K562. Most of them were obtained from the ENCODE project.

The region on chr7 contains the HOXA gene cluster.

We are busy preparing a manuscript for this exciting new update, and the above example constitutes part of a figure in the manuscript.

We invite our users to explore this function (tell me any bugs, xzhou82 AT gmail). Some helpful tips are available in a short chapter from the user manual: http://epigenomegateway.wustl.edu/browser/manual/#c-c

Presently, 44 long-range assay tracks for human, and 14 for mouse are available through our Browser, all of which are public data sets. New ones will be added whenever they are available.


SVG output
After so many earnest requests, the high-quality Browser Shot output is finally implemented. To do it, click "Apps" button in the small toolbox and select SVG option:


The generated SVG file can be viewed in your web browser. The file is usually large, so please be patient while it's being transferred. Firefox and Chromium have no problem displaying such files. With Chromium you can print the SVG file to a PDF file.

The SVG output function is completely different from UCSC's PDF/PS output function: everything is done on the client side, with no server-side rendering involved.

This function is still a prototype. Let me know if it breaks. I'm still working to enable SVG outputs for the Bird's Eye View and the Companion panel from the long-range track view.

We sincerely recommend the Chromium web browser (http://www.chromium.org/Home), it is absolutely free, and available for all platforms. Epigenome Browser loves it!

Monday, June 4, 2012

v6 release - major bug fix

This code release comes with major bug fix. Please go to the source code archive on our server or Dropbox to download.


Bug fix
When the Browser was showing data at basepair level, the use of coordinate was wrong at multiple places:

  1. requesting data for the genomic region exposed by panning
  2. placing the blue square box indicator on the small chromosome ideogram at the top left of the page
  3. obtaining the genomic sequence for the region under view
The problems that were visible to us have now been fixed. If you notice any more, comment on our blog and let us know!



Minor improvements
Y axis scale can now be drawn in genome heatmap:


To show the scale, increase track height to 20 or more pixels and the scale will draw automatically on the left side of the canvas.

And here's a slight improvement on the control panel. At "Heatmap track" panel, contents for "Data sets" and "Configure track selection panel" have been re-styled as "tab-page". Click any of the two buttons at top of panel to see:




This tab-page style is also applied to Bird's eye view panel.

Finally, we start depositing our source code archives into a dropbox folder which is publicly accessible. I guess this serves a nice fallback in case tornado strikes our server room and demolishes the servers. The user manual however is not included in the archive anymore to shrink size as there's only a scant 3GB space with my account so it needs to be lean (besides Xin hasn't updated his user manual since stone age).

Saturday, June 2, 2012

v5 release - gene indicating function

Version 5 of Wash U Browser is available now. Follow this link and obtain the source code: http://epigenomegateway.wustl.edu/source/


New features
Sometimes only part of a gene, rather than the whole gene, is displayed in the genome browser, for example when viewing short intervals centered on the transcription start sites of genes in Gene Set View:



Genes in the gene track are only partially displayed in the view above. As a result, it is difficult to see a gene's entire structure without quitting Gene Set View and relocating to that gene.

Now this problem is solved with the gene indication function. You only need to click on this gene to invoke the tooltip balloon and it will show you:


On the top of balloon, a small graph is drawn to display the gene in its entirety. As usual, the thin lines indicate introns, thick blocks are exons, and the smaller blocks at each end are untranslated regions (UTRs). A yellow box marks the part of the gene that's currently visible in the gene track.

The size of this graph is constant; it won't change with the actual length of the gene or the balloon size. The graph won't show up for genes that are completely displayed in the track. Currently there's no clickable function attached to this graph, but I wonder if users would find it useful to be able to control the plotting color by clicking it? Anything else?

Like I promised, the gene tooltip balloon is turning into a fully loaded dashboard. More interesting and useful functions will be added here to let you know, recognize, study, and operate the gene under your focus. Hang on with us!



Bug fixes
  1. Imprecise gene placement in the gene track during gene set view is fixed
  2. Imprecise indicator (blue rectangle) placement over small chromosome ideogram on top left corner of the page is fixed.


Tuesday, May 29, 2012

v4 code release: faster gene track rendering

With the Start of Summer, we release our 4th version of Wash U Epigenome Browser. Following is a summary.


Major improvement
The Browser response is made faster by the new gene track. To render a gene track, only one Ajax query is needed to fetch the gene data. In the past two Ajax queries were needed and dragged down performance. A lot of coding work has been done for this, basically replacing the old design with the new one. Clear-looking arrow marks indicate strand of the gene. The marks are drawn both over introns and exons:



Minor improvements and changes

To show more information about a gene, click on it to invoke the tooltip:


In the balloon, the gene symbol is printed in a large italic font at the top left. On its right is the "internal identifier" related to this gene track, and the link to the entry of this gene in the external database from which the gene info is retrieved.

In the middle of the balloon, click "show gene structure" to reveal the data on this gene:


We are going to add the Gene Ontology annotation for all the species, and that will be represented in the same style as gene structure data here. The small static tooltip balloon is turning into a dashboard with rich information.

Also at control panel, some more functions are added to allow user to more conveniently control appearance of genomic feature tracks. Go to "Tracks" > "Genomic features" to see the contents:


This table shows your collection of genomic feature tracks. You all know what the wrench icon does: click it to configure the track's rendering through the panel displayed in the floating toolbox. A few color boxes are also available on the right, which are shortcuts to the color configuration function. Clicking them invokes the color palette:


❖, ☁, and T are for "bigBed" tracks (e.g. genes or repeats), controlling the color of the "item box", "density plots", and "name text" respectively. Numerical tracks like "vertebrate PhyloP" have different shortcuts: "+" and "-" control the plotting color of positive and negative values. Hope you'll like them!





*** notice to Maize genome users ***

We have changed the chromosome names from "n" to "chrn", where "n" is one of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, Mt, Pt, UNKNOWN. We previously followed the naming that maizesequence.org uses on their site, but we were persuaded to change to the "conventional style" (chr1 instead of 1). Please be aware of this change, as you need to prepare your bigWig/bigBed/BAM files accordingly so that they are properly displayed in the browser.
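
For text-based data (for example bed or bedGraph text produced before conversion), the renaming can be scripted. This is a minimal sketch under that assumption, with a hypothetical function name; binary bigWig/bigBed/BAM files cannot be edited this way and need to be regenerated from renamed inputs with their own tools.

```python
# Old-style maize chromosome names, as listed in the notice above.
MAIZE_NAMES = {str(n) for n in range(1, 11)} | {"Mt", "Pt", "UNKNOWN"}


def rename_chromosome(line):
    """Prefix the first tab-separated column with "chr" if it is an old-style name."""
    chrom, separator, rest = line.partition("\t")
    if chrom in MAIZE_NAMES:
        chrom = "chr" + chrom
    # Lines that already use the new style are returned unchanged.
    return chrom + separator + rest


print(rename_chromosome("1\t11499583\t11999166\t0.5"))
```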



Bug fix
A small coding error that led to the erroneous display of some Arabidopsis and maize genes is fixed. Now the genes display correctly:



Monday, May 21, 2012

v3 code release - Maize genome supported

Version 3 of the Wash U Epigenome Browser source code is available now! Follow this link to download. Following is an account of what's new.


New genome assembly
We're very happy to support the maize genome (from the B73 strain, assembly version 2).

 (from Wikipedia)


Corn is nutritious and makes healthy food. While in industrial countries, maize is grown to feed the livestock and produce biofuel, maize researchers all know how important maize is as a staple food for people in many parts of the world.

So we are now supporting the maize genome in our Browser, with the hope that maize researchers and breeders can benefit from browsing maize genomic data using our service.

Data including genome sequence, gene predictions, and repeat predictions were downloaded from http://www.maizesequence.org/index.html. Apart from a few genome annotation tracks, the maize genome database is currently empty. We will certainly add public maize data sets here (let us know what you would like to see), but you can always view your own data sets via the custom track and Data Hub functions. Click the following link to open the browser showing the maize genome and tracks from a sample data hub:

http://epigenomegateway.wustl.edu/browser/?genome=AGPv2&coordinate=1:11499583-11999166&datahub=http://epigenomegateway.wustl.edu/browser/b73/testhub.txt&gftk=AGPv2_5a,full

You can find a sketch of the procedures for making maize database here.


Bug fixes
  1. At the heatmap track facet browsing panel, clicking the ⊞ or ⊟ will correctly open or fold the contents (it used to generate an error).
  2. When the correlation function is in use, any newly added heatmap tracks will have their correlation coefficients properly computed and displayed.
  3. Occasional error encountered when clicking some items in gene track is now eliminated.


Minor improvements
To make it easier to configure the metadata color map, a button has been added at the right side of the metadata color map:

Clicking this button will display a panel containing the metadata vocabulary. You can browse through the vocabulary and add/remove terms from the color map by toggling the checkboxes:


Sunday, May 20, 2012

Variant of BED file format (and how to make it)

UPDATE: the contents of this post are obsolete, as the WashU Epigenome Browser no longer supports the bigBed file format.

Look at this post to see how to reformat your data into the tabix format.

-----------------------------------

In the Wash U Epigenome Browser, we use a slightly altered version of the BED format to encode positional data of genomic features: the 5th field is set to a unique integer, to be used as the ID of the genomic feature represented by that line. There's no upper bound on the ID value; it can go as high as 10 million if there are 10 million lines in that BED file. (example)
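To make the altered format concrete, here is a minimal Python sketch that overwrites the 5th field of each BED line with a unique integer. The function name is hypothetical, and the sketch assumes tab-delimited lines with at least five fields; numbering the IDs sequentially from 1 is an assumption, since the only stated requirement is that each ID is unique.

```python
def assign_ids(lines, start=1):
    """Yield BED lines whose 5th field is replaced by a unique integer ID.

    Assumes tab-delimited input with at least 5 fields per line.
    Sequential numbering from `start` is an illustrative choice.
    """
    next_id = start
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
        fields[4] = str(next_id)  # the 5th field becomes the feature ID
        next_id += 1
        yield "\t".join(fields)


if __name__ == "__main__":
    bed = [
        "chr1\t100\t200\tgeneA\t0\t+",
        "chr1\t300\t400\tgeneB\t0\t-",
    ]
    for out in assign_ids(bed):
        print(out)
```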

But the day breaks. Right now, users who diligently follow our guideline to prepare a custom genomic feature track will encounter the following error when converting a BED file into a bigBed file with the bedToBigBed program:


At line xx, score (xxx) must be between 0 and 1000


... where "xxx" is an integer bigger than 1000.


This is all because the BED format specification treats the 5th field as a "score", whose value must be between 0 and 1000. The bedToBigBed program scrutinizes the input BED file and squawks when it sees a "score" bigger than 1000.

To work around this hurdle and generate properly working bigBed files, use the following bedToBigBed binary instead of the native one:

http://epigenomegateway.wustl.edu/bedToBigBed

This binary was compiled on a PC with a 32-bit Ubuntu operating system, using the Kent Source Tree downloaded on Apr 25, 2012. It should work on both 32-bit and 64-bit Linux PCs.


Following is the recipe to rebuild a bedToBigBed program that doesn't squawk:

  1. Download the Kent Source Tree from http://hgdownload.cse.ucsc.edu/admin/jksrc.zip and decompress it; the directory "kent" will be created in your working directory.
  2. Open file "kent/src/lib/basicBed.c"
  3. At line 1375, if the content is "if (!isCt && (bed->score < 0 || bed->score > 1000))", remove lines 1375 and 1376. Otherwise do nothing.
  4. Save your edit on this file.
  5. Resume normal procedure to build the library and bedToBigBed binary.
    1. Remove "-Werror" tag from file kent/src/inc/common.mk
    2. Go to kent/src/
    3. Run "make libs"
    4. Go to kent/src/utils/bedToBigBed/
    5. Run "make", then a new "bedToBigBed" binary will be generated


We have to stick to this variation of the BED format because genomic feature tracks need the ID field in order to scroll (the ID is a neat way for the Browser to tell which genomic features have been extended by scrolling, so the new data can be correctly appended to cached data). We don't think it's bizarre, savage, or ruthless, because the 5th field of a BED file is already of integer type, so why not make it free of limits, free to bear any value it wants? We apologize for any unsettlement that might arise, and we're happy to hear your thoughts.

Free(dom) is good, isn't it?

Sunday, May 6, 2012

v2 code release - URL parsing and persistent hyperlinks

UPDATE December 27th, 2012:
The URL parameter function described in this post is obsolete; new specifications have been in use since Version 13. See this post for details:
http://washugb.blogspot.com/2012/12/url-parameter-specification-effective.html




Today we're very happy to release our second version of Wash U Epigenome Browser. Go to source code archive page and download subtleKnife.v2.tgz to get the code.


New features
You can now supply parameters through URL as a way to control browser behavior, just like good-old CGI programs do.

Very quickly, click the following link and see how it works:

http://epigenomegateway.wustl.edu/browser/?genome=hg19&datahub=http://remc.wustl.edu/xzhou/hub.txt

Once clicked, the browser will be displayed showing human hg19 genome, and tracks from the sample data hub.

The expected use of this feature is for users to obtain persistent hyperlinks to their current browsing status and share them with collaborators. External web sites can also provide links and buttons that direct users to our browser with customized display information (e.g. a specific composition of custom tracks hosted on external websites).

Indeed, a URL with parameters is an alternative to the session function. The URL can be truly persistent, as it doesn't require any kind of storage to stay valid. A session's information, by contrast, must be kept in the database, and thus could be deleted and become invalid. On the other hand, a URL with parameters can become a really long string, whereas with the "session" parameter the URL can be very short and handy.

The composition of such URL is:

[base URL] + '?' + [key1] + '=' + [value1] + '&' + [key2] + '=' + [value2] + '&' + [more key/values pairs]

Explanations:
  1. Base URL is http://epigenomegateway.wustl.edu/browser/.
  2. The question mark '?' must be present and must immediately follow the base URL.
  3. Key and value are joined by '=', and multiple key/value pairs are joined by '&'.
The following is the key/value specification. All keys are case-insensitive:
  • "genome", the name of genome to display, allowed values are:
    • hg19 (human)
    • mm9 (mouse)
    • danRer7 (zebrafish)
    • dm3 (fruitfly)
    • tair10 (arabidopsis)
  • "session", to restore a saved session. Value is session id string
  • "statusid", value is the status ID; this parameter is used in conjunction with "session"
  • "metadata", to decide which metadata terms are to be displayed in metadata color map. Value is comma joined metadata term names (experimental, only leaf terms have been tested to work in this way)
  • "coordinate", to decide specific genomic position the browser should be displaying. Value is coordinate string in form of chr1:5000-6000
  • "juxtapose", to run juxtaposition on a bigBed track. Value has two possibilities:
    • if supplying a URL, the URL must point to a valid bigBed file. The additional parameter "juxtaposecustom=on" must be supplied as well.
    • if using native genomic feature track to run juxtaposition, valid track name must be provided (experimental, the list of bigBed tracks is not explicitly given but can be found in config/hg19/makeDb.sql file)
  • "geneset", to run Gene Set View. Value is comma separated gene names. This parameter should not be used with "coordinate" or "juxtapose".
  • "datahub", to display tracks from a data hub. Value is URL to a data hub descriptor file
  • "hmtk", to display specific native heatmap tracks. Value is comma separated native heatmap track names (experimental, the list of native heatmap tracks is not explicitly given but can be found in config/hg19/track2Detail file)
  • "customhmtk", to display custom heatmap tracks. Value is in form of "name1,url1,name2,url2,...", where name/url is to define one custom heatmap track. All fields are joined by comma, so track name must not contain comma.
  • "gftk", to display specific native genomic feature tracks. Value is in form of "name1,mode1,name2,mode2,...", where name/mode is to define display of one native genomic feature track. Mode must be one of thin/full/density (experimental)
  • "customgftk", to display custom genomic feature tracks. Value is in form of "name1,url1,mode1,name2,url2,mode2,...". It has almost same requirement as above.
  • "bam", to display BAM tracks hosted on our server. This function is experimental because we are still working to align reads and generate BAM files for every heatmap track.
  • "custombam", to display custom BAM tracks. Value is in form of "name1,url1,mode1,name2,url2,mode2,...". It has the same requirements as "customgftk".
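The composition rule and key/value list above can be sketched in a few lines of Python. This is only an illustration: the function names and the example track URL are hypothetical, and values are joined as-is without percent-encoding to mirror the example links in this post (whether the browser also accepts percent-encoded values is not stated here).

```python
BASE_URL = "http://epigenomegateway.wustl.edu/browser/"


def join_tracks(tracks):
    """Flatten [(name, url), ...] or [(name, url, mode), ...] into 'name1,url1,...'.

    Fields are joined by commas, so no field may itself contain a comma.
    """
    fields = []
    for track in tracks:
        for field in track:
            if "," in field:
                raise ValueError(f"field must not contain a comma: {field!r}")
            fields.append(field)
    return ",".join(fields)


def build_url(params, base_url=BASE_URL):
    """Compose base URL + '?' + key=value pairs joined by '&'.

    `params` is a list of (key, value) tuples. Keys are case-insensitive,
    and "geneset" should not be combined with "coordinate" or "juxtapose".
    """
    keys = {key.lower() for key, _ in params}
    if "geneset" in keys and keys & {"coordinate", "juxtapose"}:
        raise ValueError('"geneset" cannot be used with "coordinate" or "juxtapose"')
    return base_url + "?" + "&".join(f"{key}={value}" for key, value in params)


if __name__ == "__main__":
    # The track name and URL below are placeholders, not real resources.
    custom = join_tracks([("sample1", "http://example.org/sample1.bigWig")])
    print(build_url([
        ("genome", "hg19"),
        ("coordinate", "chr7:26663835-28123541"),
        ("customhmtk", custom),
    ]))
```

Passing the parameters as an ordered list of pairs keeps "genome" first, as in the example links.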

In the next code release, we expect the browser to be able to automatically generate a URL as a snapshot of the current browsing status.


Bug fix

  1. The situation of no term displayed in metadata color map now works with session saving/restore.
  2. The browser now tolerates empty lines in data hub descriptor file.
  3. When no track is displayed in genome heatmap, clicking "add track" button in pairwise comparison panel won't encounter error. A warning message will be printed out in message console complaining there's no track to choose from.



Minor changes
In the track selection panel, custom genomic feature tracks and BAM tracks are now presented in the following new way:


Click the new tabs "Genomic features / custom" and "Read alignments / custom" to see tracks of these types.

In the Bird's Eye View panel, left-click (not right-click) the wiggle plot image to show a small context menu: