
Data access using AMI

Page under construction

Important links

See also

Answers to user questions; these may eventually move to a FAQ.

Introduction

AMI is the general ATLAS Metadata Interface and dataset selection interface. Any database that provides descriptors of its content in the AMI-compliant format can be accessed through AMI. One special application is the browsing and search of datasets, which provides, among others, the following services:

An excellent tutorial introduces the user to the system step by step. The most frequently used functions are listed below in short form. You should register with AMI, preferably with your Grid certificate, so that the system can authorize you automatically. For each registered user, AMI provides a personalized AMI user-home page, which lists your AMI bookmarks (distinct from your browser's bookmarks), e.g. URLs of displays of dataset selections that you made previously. Below we distinguish between the AMI-home page and the AMI user-home page.

There are still a few rough edges; please report them to atlas-bookkeeping. The behaviour of links will be reviewed and improved; you may find that the browser's back button does not work, or that clicking the right mouse button leads to an unexpected page. For the time being, avoid using the right mouse button. Occasionally the AMI server responds slowly; in this case, try again.

AMI dataset search services

The following sections highlight some of the functionality of the AMI dataset search. They do not replace the tutorial, but are intended to provide a quick overview and invitation to explore the system in more detail.

1 - Preparation

User registration
Register
 > AMI-home > menu Applications > ATLAS > Dataset Selection
 > menu Tools > Register User
Register your Grid certificate (recommended)
 > menu Tools > Register Certificate
				  
Access your AMI-home page
On most pages
 > top navigation bar > "user symbol" + Home
Alternatively start from AMI-home
 > AMI-home > menu Applications > ATLAS > Dataset Selection
				 
Note: AMI-Home and user-Home are presented with different symbols. You may want to bookmark your user-Home page in your browser.

2 - Quick search

This is the standard way of searching and investigating datasets. It provides a quick overview of existing datasets and leads to more detailed information.

Display the table of selected datasets
 > user-Home > menu Datasets > Simple Search > choose button "dataset name search" 
					Example 1 :  > in the field enter  csc%Zee% 
					Example 2 :  > in the field enter  %csc%Zee%AOD% 
				
Example 1 lists two project-subprojects, example 2 lists one only. The latter is used below to demonstrate how to explore the information available in a table.
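The role of the % wildcard in the two examples can be illustrated offline. The following is a minimal sketch, assuming that % behaves like the SQL LIKE wildcard (any run of characters, including none) and that the pattern must match the whole dataset name. The helper like_match and the dataset names are hypothetical and are not part of AMI.

```python
import re


def like_match(pattern, name):
    """Return True if name matches pattern, where '%' stands for any run of characters.

    Assumption: the search field follows SQL LIKE semantics for '%'.
    """
    # Escape the literal pieces and join them with '.*' in place of each '%'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("%"))
    return re.fullmatch(regex, name, flags=re.DOTALL) is not None


# Hypothetical dataset names, for illustration only.
names = [
    "csc11.000001.Zee.recon.AOD.v1",
    "csc11.000001.Zee.evgen.EVNT.v1",
    "trig0_csc11.000001.Zee.recon.AOD.v1",
]

for pattern in ("csc%Zee%", "%csc%Zee%AOD%"):
    hits = [n for n in names if like_match(pattern, n)]
    print(pattern, "->", hits)
```

In this sketch the leading % of the second pattern also admits names that carry a prefix before csc, while the AOD piece restricts the result to names containing that string.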
Order the table of datasets
Choose to display all datasets. You may have to click "Full screen" to see all columns.
 > in field "View" enter > 100 > press button "Search" 
Then order the datasets by prodsysStatus. To see the available values, click on the "bulb" next to prodsysStatus; close the popup by moving the mouse over "Close" in its upper right corner. Try the other columns as well; note that not all descriptions in column physicsSubcategory are sufficiently descriptive to be useful in a selection command.
Meaning of control buttons at right, above the table
Refine Query               Allows further selections, see below
Edit Fields                Selects the columns to be displayed, e.g. remove dataType
Export > Ganga / XML / ..  To get a readable display, you may have to inspect the page in the browser's source view
Query > gLite / SQL        Shows in a popup the query command used to select the datasets
Meaning of URLs below the logicalDatasetName
DQ2           Link to Panda/BNL page of the dataset: access to file content, location at Tiers, download command, etc.
Ganga export  Text file with dataset name
Prodsys       Information on task on Grid
Provenance    Relationship to other datasets, jobOptions etc., see below
Series        Front part of dataset name: information on trigger, version of simulation, reconstruction

3 - Overview

Overview of catalogues and series
The overview page displays the catalogues and the corresponding series, where the series is the front part of the dataset name. Note that series names like trig0-calib0-csc11 are constructed by prepending, at each subsequent processing step, the action to the original series name csc11. Example:
 > user-Home >  menu Datasets >  Overview
						 >  select in column Catalogue csc-production  
						 > in column Series  > choose trig0-calib0-csc11 > click Browse 
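The naming convention described above can be unpacked with a few lines of code. This is a minimal sketch, assuming that the hyphen is the separator and that the original series name itself contains no hyphen; the function name split_series is hypothetical and not part of AMI.

```python
def split_series(series):
    """Split a series name into the original series and the prepended step tags.

    Assumption: tags are separated by '-' and each later step is prepended,
    so the left-most tag belongs to the most recent processing step.
    """
    parts = series.split("-")
    original = parts[-1]
    # Reverse the prefix so that the tags appear in processing order.
    steps = list(reversed(parts[:-1]))
    return original, steps


original, steps = split_series("trig0-calib0-csc11")
print(original)  # csc11
print(steps)     # ['calib0', 'trig0']
```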
				

4 - Advanced search

Advanced search
 > user-Home >  menu Datasets >  Advanced Search
				
The advanced search page is targeted at physics groups. Menus offer choices when standard values are in use. See the bottom field, which offers choices of physics properties, e.g. the eta region or the detectorLayout.

5 - More advanced searches

Refine query
On the page with the dataset search result, click Refine Query just above the table on the right. This produces a graphical display of the current search term. Clicking on the oval nodes lets you modify the selection or add further conditions as "AND" or "OR".
Example: start from the table of search results with dataset name %csc%Zee%AOD%.
 > click Refine Query
					 The newly opened page displays the search term:  
					 > click on the node "dataset.amiStatus…"   [watch the new fields]
					 Add a new search condition, e.g. add an "AND" node:  
					 > select button "AND" > click "Add node"
					 Edit the new node called "Edit node" by clicking on it;  
					 > select dataset.prodsysStatus > click "Add to clause"  
					 > complete the field to read dataset.prodsysStatus LIKE 'EVENTS_AVAILABLE'  
					 > click "Save" > click "Execute Query"  
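The effect of adding an "AND" node can be pictured as joining two conditions into one search term. The sketch below only illustrates that idea; it does not reproduce the query text that AMI generates (which the Query > gLite / SQL button shows). The helper names and the field name dataset.logicalDatasetName are assumptions; only dataset.prodsysStatus LIKE 'EVENTS_AVAILABLE' is taken from the steps above.

```python
def like_clause(field, value):
    """Build a single condition such as: dataset.prodsysStatus LIKE 'EVENTS_AVAILABLE'."""
    # Double any single quote so the value stays one quoted literal.
    escaped = value.replace("'", "''")
    return f"{field} LIKE '{escaped}'"


def combine(operator, *clauses):
    """Join conditions with AND or OR, mirroring the 'Add node' step."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return "(" + f" {operator} ".join(clauses) + ")"


# dataset.logicalDatasetName is an assumed field name, used for illustration only.
name_condition = like_clause("dataset.logicalDatasetName", "%csc%Zee%AOD%")
status_condition = like_clause("dataset.prodsysStatus", "EVENTS_AVAILABLE")
print(combine("AND", name_condition, status_condition))
```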
				
Explore also the second graph, which displays the relationships of the dataset and gives access to more search terms. Click for example on "dataset physics properties" and watch the possible choices.
Note: choices of allowed values, or examples of values, should still be added to the display.
Values for search terms
Information about the fields and their values is provided in the Database information. For example, to see the information about the csc database:
 > user-Home > Databases > expand "csc" > expand "Production"
To see, for example, the values for a job filter (the only one existing at present)
 > go to jobFilters > click browse
				
These displays are partly very technical and not easy to understand; some improvement will certainly become available.

6 - Detailed information on a dataset

Detailed metadata of a dataset
Start from the table of dataset search results. Move your mouse over a dataset with prodsysStatus EVENTS_AVAILABLE and click on "details" in the left-most column. The result is a table of "element's information" with all metadata known to AMI, plus a list of "Children elements" (e.g. jobOptions, event_range) and "associated elements". Clicking on "event_range" displays production information, such as the number of events and the Grid flavour. Move your mouse over an entry in the header line, or click on it, for an explanation of the content. Click again on the left-most entry "details" to get even more information. Clicking on "dataset" returns you to the table presentation.
 On table of results 
						 > click details      [left-most column]
						 > click event_range  
						 > click details      [left-most column]
						 > click dataset  
				
Note: Do not use the browser's back button, but the back buttons provided by AMI on top of the page.
Provenance
Provenance displays the relationship of a dataset to other datasets and to various input data files such as jobOptions. It extracts the filter conditions from the jobOptions. Start from a table of dataset search results, choose a dataset with prodsysStatus EVENTS_AVAILABLE and click on Provenance. The resulting display shows the datasets that lead, through the production chain, to the current file, which is shown in a box; other relatives are shown in ellipses.
Click on the left-most ellipse, which should correspond to the event generation. If the child element "jobOptions" is available, click on it. The newly opened page displays the jobOption file; click again on "details" in the left-most column. This extends the table and adds information at the bottom. If available, click on the child element "jobFilters". A table of cuts will be shown (some are identical - reason? TBD).
Click on "dataset" to return to the one-line table, which now displays the generation dataset. Repeat the Provenance cycle to see all datasets that were derived from the event generation dataset. Summary of steps:
 > click Provenance      	  [below dataset name] 
					   > click left-most bubble     [evgen ancestor] 
					   > click jobOptions >  click details  > click jobFilters   
					   > click dataset              [brings you to the table with the evgen dataset] 
					   > click Provenance           [this time starting from the evgen dataset] 
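The two Provenance passes summarised above amount to walking a dataset's production chain back to its evgen ancestor and then listing everything derived from that ancestor. The following is a minimal sketch of that idea, assuming that every dataset has at most one parent and that the records contain no cycles; the dataset names, the PARENT_OF table and both functions are hypothetical and do not reflect how AMI stores provenance.

```python
# Hypothetical provenance records: each dataset maps to the dataset it was made from.
PARENT_OF = {
    "Zee.simul": "Zee.evgen",
    "Zee.digit": "Zee.simul",
    "Zee.recon.ESD": "Zee.digit",
    "Zee.recon.AOD": "Zee.digit",
}


def ancestors(dataset, parent_of):
    """Return the chain of ancestors, nearest parent first, root last."""
    chain = []
    current = dataset
    while current in parent_of:
        current = parent_of[current]
        chain.append(current)
    return chain


def descendants(root, parent_of):
    """Return every dataset derived, directly or indirectly, from root."""
    # Invert the child -> parent records into parent -> children.
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            found.append(child)
            stack.append(child)
    return sorted(found)


chain = ancestors("Zee.recon.AOD", PARENT_OF)
evgen = chain[-1]  # the root of the chain, here the evgen dataset
print(chain)
print(descendants(evgen, PARENT_OF))
```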
				
