`gcpy`

GCPY initialization script. Imports nested packages for convenience.

Submodules

Package Contents

Classes

`_GlobVars`	Private class _GlobVars contains global data that needs to be
`vert_grid`
`CSGrid`	Generator for cubed-sphere grid geometries.

Functions

`convert_lon`(data[, dim, format, neg_dateline])	Convert longitudes from -180..180 to 0..360, or vice-versa.
`get_emissions_varnames`(commonvars[, template])	Will return a list of emissions diagnostic variable names that
`create_display_name`(diagnostic_name)	Converts a diagnostic name to a more easily digestible name
`print_totals`(ref, dev, f[, masks])	Computes and prints Ref and Dev totals (as well as the difference
`get_species_categories`([benchmark_type])	Returns the list of benchmark categories that each species
`archive_species_categories`(dst)	Writes the list of benchmark categories to a YAML file
`add_bookmarks_to_pdf`(pdfname, varlist[, ...])	Adds bookmarks to an existing PDF file.
`add_nested_bookmarks_to_pdf`(pdfname, category, ...[, ...])	Add nested bookmarks to PDF.
`add_missing_variables`(refdata, devdata[, verbose])	Compares two xarray Datasets, "Ref", and "Dev". For each variable
`reshape_MAPL_CS`(da)	Reshapes data if its dimensions indicate MAPL v1.0.0+ output
`get_diff_of_diffs`(ref, dev)	Generate datasets containing differences between two datasets
`slice_by_lev_and_time`(ds, varname, itime, ilev, flip)	Slice a DataArray by desired time and level.
`rename_and_flip_gchp_rst_vars`(ds)	Transforms a GCHP restart dataset to match GCC names and level convention
`dict_diff`(dict0, dict1)	Function to take the difference of two dict objects.
`compare_varnames`(refdata, devdata[, refonly, devonly, ...])	Finds variables that are common to two xarray Dataset objects.
`compare_stats`(refdata, refstr, devdata, devstr, varname)	Prints out global statistics (array sizes, mean, min, max, sum)
`convert_bpch_names_to_netcdf_names`(ds[, verbose])	Function to convert the non-standard bpch diagnostic names
`get_lumped_species_definitions`()	Returns lumped species definitions from a YAML file.
`archive_lumped_species_definitions`(dst)	Archives lumped species definitions to a YAML file.
`add_lumped_species_to_dataset`(ds[, lspc_dict, ...])	Function to calculate lumped species concentrations and add
`filter_names`(names[, text])	Returns elements in a list that match a given substring.
`divide_dataset_by_dataarray`(ds, dr[, varlist])	Divides variables in an xarray Dataset object by a single DataArray
`get_shape_of_data`(data[, vertical_dim, return_dims])	Convenience routine to return the shape (and dimensions, if
`get_area_from_dataset`(ds)	Convenience routine to return the area variable (which is
`get_variables_from_dataset`(ds, varlist)	Convenience routine to return multiple selected DataArray
`create_dataarray_of_nan`(name, sizes, coords, attrs[, ...])	Given an xarray DataArray dr, returns a DataArray object with
`check_for_area`(ds[, gcc_area_name, gchp_area_name])	Makes sure that a dataset has a surface area variable contained
`get_filepath`(datadir, col, date[, is_gchp, ...])	Routine to return file path for a given GEOS-Chem "Classic"
`get_filepaths`(datadir, collections, dates[, is_gchp, ...])	Routine to return filepaths for a given GEOS-Chem "Classic"
`extract_pathnames_from_log`(filename[, prefix_filter])	Returns a list of pathnames from a GEOS-Chem log file.
`get_gcc_filepath`(outputdir, collection, day, time)	Routine for getting filepath of GEOS-Chem Classic output
`get_gchp_filepath`(outputdir, collection, day, time)	Routine for getting filepath of GCHP output
`get_nan_mask`(data)	Create a mask with NaN values removed from an input array
`all_zero_or_nan`(ds)	Return whether ds is all zeros, or all nans
`dataset_mean`(ds[, dim, skipna])	Convenience wrapper for taking the mean of an xarray Dataset.
`dataset_reader`(multi_files)	Returns a function to read an xarray Dataset.
`read_config_file`(config_file)	Reads configuration information from a YAML file.
`get_timestamp_string`(date_array)	Convenience function returning the datetime timestamp based on the given input
`add_months`(start_date, n_months)	Args:
`is_full_year`(start_date, end_date)	Verifies if two dates are a full year starting Jan 1.
`adjust_units`(units)	Creates a consistent unit string that will be used in the unit
`convert_kg_to_target_units`(data_kg, target_units, ...)	Converts a data array from kg to one of several types of target units.
`convert_units`(dr, species_name, species_properties, ...)	Converts data stored in an xarray DataArray object from its native
`check_units`(ref_da, dev_da[, enforce_units])	Ensures the units of two xarray DataArrays are the same.
`data_unit_is_mol_per_mol`(da)	Check if the units of an xarray DataArray are mol/mol based on a set
`compute_ste`(globvars)	Computes the strat-trop-exchange, taken as species flux
`print_ste`(globvars, df)	Prints the strat-trop exchange table.
`make_benchmark_ste_table`(devstr, files, year[, dst, ...])	Driver program. Computes and prints strat-trop exchange for
`make_grid_LL`(llres[, in_extent, out_extent])	Creates a lat/lon grid description.
`make_grid_CS`(csres)	Creates a cubed-sphere grid description.
`make_grid_SG`(csres, stretch_factor, target_lon, target_lat)	Creates a stretched-grid grid description.
`get_input_res`(data)	Returns resolution of dataset passed to compare_single_level or compare_zonal_means
`call_make_grid`(res, gridtype[, in_extent, out_extent, ...])	Creates a grid description for the requested resolution and grid type
`get_grid_extents`(data[, edges])	Get min and max lat and lon from an input GEOS-Chem xarray dataset or grid dict
`get_vert_grid`(dataset[, AP, BP])	Determine vertical grid of input dataset
`make_regridder_L2L`(llres_in, llres_out[, weightsdir, ...])	Create an xESMF regridder between two lat/lon grids
`make_regridder_C2L`(csres_in, llres_out[, weightsdir, ...])	Create an xESMF regridder from a cubed-sphere to lat/lon grid
`make_regridder_S2S`(csres_in, csres_out[, sf_in, ...])	Create an xESMF regridder from a cubed-sphere / stretched-grid grid
`make_regridder_L2S`(llres_in, csres_out[, weightsdir, ...])	Create an xESMF regridder from a lat/lon to a cubed-sphere grid
`create_regridders`(refds, devds[, weightsdir, ...])	Internal function used for creating regridders between two datasets.
`regrid_comparison_data`(data, res, regrid, regridder, ...)	Regrid comparison datasets to cubed-sphere (including stretched-grid) or lat/lon format.
`reformat_dims`(ds, format, towards_common)	Reformat dimensions of a cubed-sphere / stretched-grid grid between different GCHP formats
`sg_hash`(cs_res, stretch_factor, target_lat, target_lon)
`regrid_vertical_datasets`(ref, dev[, ...])	Perform complete vertical regridding of GEOS-Chem datasets to
`regrid_vertical`(src_data_3D, xmat_regrid[, target_levs])	Performs vertical regridding using a sparse regridding matrix
`gen_xmat`(p_edge_from, p_edge_to)	Generates regridding matrix from one vertical grid to another.
`get_pressure_indices`(pedge, pres_range)	Get indices where edge pressure values are within a given pressure range
`pad_pressure_edges`(pedge_ind, max_ind, pmid_len)	Add outer indices to edge pressure index list
`convert_lev_to_pres`(dataset, pmid, pedge[, lev_type])	Convert lev dimension to pressure in a GEOS-Chem dataset
`six_plot`(subplot, all_zero, all_nan, plot_val, grid, ...)	Plotting function to be called from compare_single_level or
`compare_single_level`(refdata, refstr, devdata, devstr)	Create single-level 3x2 comparison map plots for variables common
`compare_zonal_mean`(refdata, refstr, devdata, devstr[, ...])	Create single-level 3x2 comparison zonal-mean plots for variables
`normalize_colors`(vmin, vmax[, is_difference, ...])	Normalizes a data range to the colormap range used by matplotlib
`single_panel`(plot_vals[, ax, plot_type, grid, ...])	Core plotting routine -- creates a single plot panel.
`combine_dataset`([file_list])	Wrapper for xarray.open_mfdataset, taking into account the
`validate_metrics_collection`(ds)	Determines if a Dataset contains variables for computing
`read_metrics_collection`(files)	Reads data from all "Metrics" collection netCDF files
`total_airmass`(ds)	Computes the total airmass (in both kg and molec).
`global_mean_oh`(ds, airmass_kg, mw_oh_kg)	Computes the global mean OH concentration (1e5 molec cm-3)
`lifetimes_wrt_oh`(ds, airmass_m)	Computes the lifetimes (in years) of CH4 and CH3CCl3 (aka MCF)
`init_common_vars`(ref, refstr, dev, devstr, spcdb_dir)	Returns a dictionary containing various quantities that
`compute_oh_metrics`(common_vars)	Computes the mass-weighted mean OH concentration, CH3CCl3 (aka MCF)
`write_to_file`(f, title, ref, dev, absdiff, pctdiff[, ...])	Internal routine used by print_metrics to write a specific
`print_metrics`(common_vars, dst)	Prints the mass-weighted mean OH (full atmospheric column)
`make_benchmark_oh_metrics`(ref, refstr, dev, devstr[, ...])	Creates a text file containing metrics of global mean OH, MCF lifetime,
`find_mean_oh`(filename)	Searches a GEOS-Chem "Classic" log file for the Mean OH value.
`compute_mean_oh_from_logs`(globvars)	Computes mean OH from GEOS-Chem FullChemBenchmark log files.
`print_mean_oh_from_logs`(globvars, df)	Prints the mean OH table from 1-year FullChemBenchmark log files.
`make_benchmark_oh_from_logs`(reflogdir, refstr, ...[, ...])	Creates the table of mean OH concentrations, as obtained from log files.
`scs_transform`(x, y, s, tx, ty)
`get_troposphere_mask`(ds)	Returns a mask array for picking out the tropospheric grid boxes.
`get_ind_of_pres`(dataset, pres)	Get index of pressure level that contains the requested pressure value.
`calc_rectilinear_lon_edge`(lon_stride, center_at_180)	Compute longitude edge vector for a rectilinear grid.
`calc_rectilinear_lat_edge`(lat_stride, half_polar_grid)	Compute latitude edge vector for a rectilinear grid.
`calc_rectilinear_grid_area`(lon_edge, lat_edge)	Compute grid cell areas (in m2) for a rectilinear grid.
`calc_delta_lon`(lon_edge)	Compute grid cell longitude widths from an edge vector.
`csgrid_GMAO`(res[, offset])	Return cubedsphere coordinates with GMAO face orientation
`latlon_to_cartesian`(lon, lat)	Convert latitude/longitude coordinates along the unit sphere to cartesian
`cartesian_to_latlon`(x, y, z[, ret_xyz])	Convert a cartesian coordinate to latitude/longitude coordinates.
`spherical_to_cartesian`(theta, phi[, r])	Convert spherical coordinates in the form (theta, phi[, r]) to
`cartesian_to_spherical`(x, y, z)	Convert cartesian coordinates to spherical in the form
`rotate_sphere_3D`(theta, phi, r, rot_ang[, rot_axis])	Rotate a spherical coordinate in the form (theta, phi[, r])
`total`(globvars, dict_list)	Function to take the difference of two dict objects.
`mass_from_rst`(globvars, ds, tropmask)	Computes global species mass from a restart file.
`annual_average`(globvars, ds, collection, conv_factor)	Computes the annual average of budgets or fluxes.
`annual_average_sources`(globvars)	Computes the annual average of radionuclide sources.
`trop_residence_time`(globvars)	Computes the tropospheric residence time of radionuclides.
`print_budgets`(globvars, data, key)	Prints the trop+strat budget file.
`transport_tracers_budgets`(devstr, devdir, devrstdir, year)	Main program to compute TransportTracersBenchmark budgets
`create_total_emissions_table`(refdata, refstr, devdata, ...)	Creates a table of emissions totals (by sector and by inventory)
`create_global_mass_table`(refdata, refstr, devdata, ...)	Creates a table of global masses for a list of species contained in
`make_benchmark_conc_plots`(ref, refstr, dev, devstr[, ...])	Creates PDF files containing plots of species concentration
`make_benchmark_emis_plots`(ref, refstr, dev, devstr[, ...])	Creates PDF files containing plots of emissions for model
`make_benchmark_emis_tables`(reflist, refstr, devlist, ...)	Creates a text file containing emission totals by species and
`make_benchmark_jvalue_plots`(ref, refstr, dev, devstr)	Creates PDF files containing plots of J-values for model
`make_benchmark_aod_plots`(ref, refstr, dev, devstr[, ...])	Creates PDF files containing plots of column aerosol optical
`make_benchmark_mass_tables`(ref, refstr, dev, devstr[, ...])	Creates a text file containing global mass totals by species and
`make_benchmark_wetdep_plots`(ref, refstr, dev, devstr, ...)	Creates PDF files containing plots of species concentration
`make_benchmark_aerosol_tables`(devdir, devlist_aero, ...)	Compute FullChemBenchmark aerosol budgets & burdens
`make_benchmark_operations_budget`(refstr, reffiles, ...)	Prints the "operations budget" (i.e. change in mass after
`make_benchmark_mass_conservation_table`(datafiles, runstr)	Creates a text file containing global mass of the PassiveTracer
`file_regrid`(fin, fout, dim_format_in, dim_format_out)	Regrids an input file to a new horizontal grid specification and saves it
`rename_restart_variables`(ds[, towards_gchp])	Renames restart variables according to GEOS-Chem Classic and GCHP conventions.
`drop_and_rename_classic_vars`(ds[, towards_gchp])	Renames and drops certain restart variables according to GEOS-Chem Classic
`rotate_vectors`(x, y, z, k, theta)
`schmidt_transform`(x, y, s)

Attributes

`MW_AIR_g`
`_warning_format`
`_current_dir`
`_rgb_WhGrYlRd`
`WhGrYlRd`
`maindir`
`R_EARTH_m`
`_GEOS_72L_AP`
`_GEOS_72L_BP`
`GEOS_72L_grid`
`_GEOS_47L_AP`
`_GEOS_47L_BP`
`_xmat_i`
`_xmat_j`
`_xmat_s`
`_x_lev`
`_skip_size_vec`
`_number_lumped`
`_i_lev`
`_i_lev_72`
`_skip_size`
`_xmat_72to47`
`GEOS_47L_grid`
`_CAM_26L_AP`
`_CAM_26L_BP`
`CAM_26L_grid`
`_INV_SQRT_3`
`_ASIN_INV_SQRT_3`
`vec_latlon_to_cartesian`
`vec_cartesian_to_latlon`
`vec_spherical_to_cartesian`
`vec_cartesian_to_spherical`
`AVOGADRO`
`BOLTZ`
`G`
`R_EARTH_km`
`MW_AIR_kg`
`MW_H2O_kg`
`RD`
`RSTARG`
`RV`
`skip_these_vars`
`warning_format`
`aod_spc`
`spc_categories`
`emission_spc`
`emission_inv`
`parser`

gcpy.convert_lon(data, dim='lon', format='atlantic', neg_dateline=True)

Convert longitudes from -180..180 to 0..360, or vice-versa.

Args:

data: DataArray or Dataset: The container holding the data to be converted; the dimension indicated by ‘dim’ must be associated with this container

Keyword Args (optional):

dim: str: Name of the dimension holding the longitude coordinates. Default value: ‘lon’
format: str: Controls whether to shift from -180..180 to 0..360 (‘pacific’) or from 0..360 to -180..180 (‘atlantic’). Default value: ‘atlantic’
neg_dateline: logical: If True, then the international dateline is set to -180 instead of 180. Default value: True

Returns:

data, with dimension ‘dim’ altered according to conversion rule
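
The conversion rule itself can be illustrated without xarray. The following is a minimal sketch in plain Python, not the GCPy implementation; the helper name `shift_longitudes` and its exact treatment of the dateline are assumptions made for illustration.

```python
def shift_longitudes(lons, fmt="atlantic", neg_dateline=True):
    """Shift a list of longitudes between the two conventions.

    Hypothetical helper: 'atlantic' maps 0..360 onto -180..180,
    'pacific' maps -180..180 onto 0..360.
    """
    if fmt == "pacific":
        # In Python, % with a positive modulus never returns a negative value
        return [lon % 360 for lon in lons]
    if fmt == "atlantic":
        shifted = [((lon + 180) % 360) - 180 for lon in lons]
        if not neg_dateline:
            # Represent the dateline as +180 instead of -180
            shifted = [180 if lon == -180 else lon for lon in shifted]
        return shifted
    raise ValueError(f"unknown format: {fmt!r}")


print(shift_longitudes([0, 90, 180, 270]))              # 'atlantic' convention
print(shift_longitudes([-180, -90, 0, 90], "pacific"))  # 'pacific' convention
```

With real data, the same arithmetic would be applied to the coordinate named by `dim`.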

gcpy.get_emissions_varnames(commonvars, template=None)

Will return a list of emissions diagnostic variable names that contain a particular search string.

Args:

commonvars: list of strs: A list of common variable names from two data sets. (This can be obtained with method gcpy.util.compare_varnames)
template: str: String template for matching variable names corresponding to emission diagnostics by sector Default Value: None

Returns:

varnames: list of strs: A list of variable names corresponding to emission diagnostics for a given species and sector

gcpy.create_display_name(diagnostic_name)

Converts a diagnostic name to a more easily digestible name that can be used as a plot title or in a table of totals.

Args:

diagnostic_name: str: Name of the diagnostic to be formatted

Returns:

display_name: str: Formatted name that can be used as plot titles or in tables of emissions totals.

Remarks:

Assumes that diagnostic names will start with either “Emis” (for emissions by category) or “Inv” (for emissions by inventory). This should be an OK assumption to make since this routine is specifically geared towards model benchmarking.

gcpy.print_totals(ref, dev, f, masks=None)

Computes and prints Ref and Dev totals (as well as the difference Dev - Ref) for two xarray DataArray objects.

Args:

ref: xarray DataArray: The first DataArray to be compared (aka “Reference”)
dev: xarray DataArray: The second DataArray to be compared (aka “Development”)
f: file: File object denoting a text file where output will be directed.

Keyword Args (optional):

masks: dict of xarray DataArray: Dictionary containing the tropospheric mask arrays for Ref and Dev. If this keyword argument is passed, then print_totals will print tropospheric totals. Default value: None (i.e. print whole-atmosphere totals)

Remarks:

This is an internal method. It is meant to be called from method create_total_emissions_table or create_global_mass_table instead of being called directly.

gcpy.get_species_categories(benchmark_type='FullChemBenchmark')

Returns the list of benchmark categories that each species belongs to. This determines which PDF files will contain the plots for the various species.

Args:

benchmark_type: str: Specifies the type of the benchmark (either FullChemBenchmark (default) or TransportTracersBenchmark).

Returns:

spc_cat_dict: dict: A nested dictionary of categories (and sub-categories) and the species belonging to each.

NOTE: The benchmark categories are specified in YAML file benchmark_species.yml.

gcpy.archive_species_categories(dst)

Writes the list of benchmark categories to a YAML file named “benchmark_species.yml”.

Args:

dst: str: Name of the folder where the YAML file containing benchmark categories (“benchmark_species.yml”) will be written.

gcpy.add_bookmarks_to_pdf(pdfname, varlist, remove_prefix='', verbose=False)

Adds bookmarks to an existing PDF file.

Args:

pdfname: str: Name of an existing PDF file of species or emission plots to which bookmarks will be attached.
varlist: list: List of variables, which will be used to create the PDF bookmark names.

Keyword Args (optional):

remove_prefix: str: Specifies a prefix to remove from each entry in varlist when creating bookmarks. For example, if varlist has a variable name “SpeciesConc_NO”, and you specify remove_prefix=“SpeciesConc_”, then the bookmark for that variable will be just “NO”, etc. Default value: empty string

verbose: bool: Set this flag to True to print extra informational output. Default value: False

gcpy.add_nested_bookmarks_to_pdf(pdfname, category, catdict, warninglist, remove_prefix='')

Add nested bookmarks to PDF.

Args:

pdfname: str: Path of PDF to add bookmarks to
category: str: Top-level key name in catdict that maps to contents of PDF
catdict: dictionary: Dictionary containing key-value pairs where one top-level key matches category and has value fully describing pages in PDF. The value is a dictionary where keys are level 1 bookmark names, and values are lists of level 2 bookmark names, with one level 2 name per PDF page. Level 2 names must appear in catdict in the same order as in the PDF.
warninglist: list of strings: Level 2 bookmark names to skip since not present in PDF.

Keyword Args (optional):

remove_prefix: str: Prefix to be removed from warninglist names before comparing with level 2 bookmark names in catdict. Default value: empty string (warninglist names match names in catdict)
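
The layout expected for catdict is easiest to see with a small example. The sketch below does not touch a PDF and is not the GCPy implementation. It walks a nested dictionary in order and assigns consecutive page indices to the level 2 names that are not skipped. The helper names `strip_prefix` and `bookmark_pages`, as well as the category and species names, are hypothetical.

```python
def strip_prefix(name, prefix):
    """Drop prefix from name when it is present."""
    if prefix and name.startswith(prefix):
        return name[len(prefix):]
    return name


def bookmark_pages(catdict, category, warninglist, remove_prefix=""):
    """Map each level 1 name to its (level 2 name, page index) pairs."""
    skip = {strip_prefix(name, remove_prefix) for name in warninglist}
    pages = {}
    page = 0
    for level1, level2_names in catdict[category].items():
        entries = []
        for name in level2_names:
            if name in skip:
                continue  # this name has no page in the PDF
            entries.append((name, page))
            page += 1
        pages[level1] = entries
    return pages


# Hypothetical category dictionary for a PDF of "Oxidants" plots
catdict = {"Oxidants": {"Ozone": ["O3"], "HOx": ["OH", "HO2"]}}
warninglist = ["SpeciesConc_HO2"]  # HO2 was not plotted

pages = bookmark_pages(catdict, "Oxidants", warninglist, "SpeciesConc_")
print(pages)
```

Because page indices are assigned by position, the sketch relies on the level 2 names being listed in the same order as the PDF pages, as the catdict description requires.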

gcpy.add_missing_variables(refdata, devdata, verbose=False, **kwargs)

Compares two xarray Datasets, “Ref” and “Dev”. For each variable that is present in “Ref” but not in “Dev”, a DataArray of missing values (i.e. NaN) will be added to “Dev”. Similarly, for each variable that is present in “Dev” but not in “Ref”, a DataArray of missing values will be added to “Ref”. This routine is mostly intended for benchmark purposes, so that we can represent variables that were removed from a new GEOS-Chem version by missing values in the benchmark plots. NOTE: This function assumes that the incoming datasets have the same sizes and dimensions, which is not true when comparing datasets with different grid resolutions or types.

Args:

refdata: xarray Dataset: The “Reference” (aka “Ref”) dataset
devdata: xarray Dataset: The “Development” (aka “Dev”) dataset

Keyword Args (optional):

verbose: bool: Toggles extra debug print output Default value: False

Returns:

refdata, devdata: xarray Datasets: The returned “Ref” and “Dev” datasets, with placeholder missing value variables added
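
The symmetric fill described above can be sketched with plain dictionaries standing in for the two Datasets. This is an illustration only, not the GCPy code: the helper name `fill_missing` and the variable names are hypothetical, and lists of floats replace DataArrays.

```python
import math


def fill_missing(ref, dev):
    """Return copies of ref and dev in which every variable found in only
    one mapping is added to the other as a list of NaN values."""
    ref_out, dev_out = dict(ref), dict(dev)
    for name, values in ref.items():
        if name not in dev:
            dev_out[name] = [math.nan] * len(values)
    for name, values in dev.items():
        if name not in ref:
            ref_out[name] = [math.nan] * len(values)
    return ref_out, dev_out


# Hypothetical data: one variable was removed in the Dev version
ref = {"SpeciesConc_O3": [1.0, 2.0], "SpeciesConc_OLD": [3.0, 4.0]}
dev = {"SpeciesConc_O3": [1.5, 2.5]}

ref_new, dev_new = fill_missing(ref, dev)
print(sorted(dev_new))
```

The placeholder takes its length from the variable it mirrors, which is why the note about matching sizes and dimensions matters.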

gcpy.reshape_MAPL_CS(da)

Reshapes data if its dimensions indicate MAPL v1.0.0+ output.

Args:

da: xarray DataArray: Data array variable

Returns:

data: xarray DataArray: Data with dimensions renamed and transposed to match old MAPL format

gcpy.get_diff_of_diffs(ref, dev)

Generate datasets containing differences between two datasets

Args:

ref: xarray Dataset: The “Reference” (aka “Ref”) dataset.
dev: xarray Dataset: The “Development” (aka “Dev”) dataset

Returns:

absdiffs: xarray Dataset: Dataset containing dev-ref values
fracdiffs: xarray Dataset: Dataset containing dev/ref values

gcpy.slice_by_lev_and_time(ds, varname, itime, ilev, flip)

Slice a DataArray by desired time and level.

Args:

ds: xarray Dataset: Dataset containing GEOS-Chem data.
varname: str: Variable name for data variable to be sliced
itime: int: Index of time by which to slice
ilev: int: Index of level by which to slice
flip: bool: Whether to flip ilev to be indexed from ground or top of atmosphere

Returns:

ds[varname]: xarray DataArray: DataArray of data variable sliced according to ilev and itime
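
The effect of flip can be shown with a single column of values. This sketch assumes that flip=True counts ilev from the opposite end of the level dimension; the helper name `slice_level` is hypothetical, and a list stands in for the lev dimension of a DataArray.

```python
def slice_level(column, ilev, flip):
    """Pick one entry from a list ordered along the level dimension.

    Assumption for this sketch: flip=True counts ilev from the other
    end of the list.
    """
    nlev = len(column)
    index = nlev - 1 - ilev if flip else ilev
    return column[index]


column = ["surface", "mid", "top"]
print(slice_level(column, 0, flip=False))
print(slice_level(column, 0, flip=True))
```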

gcpy.rename_and_flip_gchp_rst_vars(ds)

Transforms a GCHP restart dataset to match GCC names and level convention

Args:

ds: xarray Dataset: Dataset containing GCHP restart file data, such as variables SPC_{species}, BXHEIGHT, DELP_DRY, and TropLev, with level convention down (level 0 is top-of-atmosphere).

Returns:

ds: xarray Dataset: Dataset containing GCHP restart file data with names and level convention matching GCC restart. Variables include SpeciesRst_{species}, Met_BXHEIGHT, Met_DELPDRY, and Met_TropLev, with level convention up (level 0 is surface).
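
The renaming and level flip can be sketched with a dictionary of lists in place of a Dataset. Only the variable names come from the description above; the helper name `rename_and_flip` and the data layout are assumptions, and the actual function may handle more variables.

```python
def rename_and_flip(gchp_vars):
    """Rename GCHP restart variables to GCC names and reverse levels.

    gchp_vars maps a variable name either to a list ordered along the
    level dimension (index 0 = top of atmosphere) or to a scalar.
    """
    fixed_names = {
        "BXHEIGHT": "Met_BXHEIGHT",
        "DELP_DRY": "Met_DELPDRY",
        "TropLev": "Met_TropLev",
    }
    out = {}
    for name, values in gchp_vars.items():
        if name.startswith("SPC_"):
            new_name = "SpeciesRst_" + name[len("SPC_"):]
        else:
            new_name = fixed_names.get(name, name)
        # Reverse only data that has a level dimension (lists here)
        out[new_name] = values[::-1] if isinstance(values, list) else values
    return out


gchp = {"SPC_O3": [1, 2, 3], "DELP_DRY": [10, 20], "TropLev": 35}
print(rename_and_flip(gchp))
```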

gcpy.dict_diff(dict0, dict1)

Function to take the difference of two dict objects. Assumes that both objects have the same keys.

Args:

dict0, dict1: dict: Dictionaries to be subtracted (dict1 - dict0)

Returns:

result: dict: Key-by-key difference of dict1 - dict0
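
A minimal sketch of the key-by-key difference, assuming numeric values and identical keys in both dictionaries (the function name `dict_diff_sketch` and the example keys are hypothetical):

```python
def dict_diff_sketch(dict0, dict1):
    """Return dict1 - dict0 for every key of dict0."""
    return {key: dict1[key] - dict0[key] for key in dict0}


start = {"Pb210": 10.0, "Be7": 4.0}
end = {"Pb210": 12.5, "Be7": 3.0}
print(dict_diff_sketch(start, end))
```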

gcpy.compare_varnames(refdata, devdata, refonly=None, devonly=None, quiet=False)

Finds variables that are common to two xarray Dataset objects.

Args:

refdata: xarray Dataset: The first Dataset to be compared. (This is often referred to as the “Reference” Dataset.)
devdata: xarray Dataset: The second Dataset to be compared. (This is often referred to as the “Development” Dataset.)

Keyword Args (optional):

quiet: bool: Set this flag to True if you wish to suppress printing informational output to stdout. Default value: False

Returns:

vardict: dict of lists of str: Dictionary containing several lists of variable names, keyed as follows:

commonvars: List of variables that are common to both refdata and devdata.
commonvarsOther: List of variables that are common to both refdata and devdata, but do not have lat, lon, and/or level dimensions (e.g. index variables).
commonvars2D: List of variables that are common to refdata and devdata, and that have lat and lon dimensions, but not level.
commonvars3D: List of variables that are common to refdata and devdata, and that have lat, lon, and level dimensions.
refonly: List of 2D or 3D variables that are only present in refdata.
devonly: List of 2D or 3D variables that are only present in devdata.
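
The classification into these lists amounts to set operations plus a check on dimension names. The sketch below is a simplified illustration, not the GCPy code. It assumes the dimensions are literally named lat, lon, and lev, uses dictionaries of dimension tuples in place of Datasets, and does not restrict refonly and devonly to 2D or 3D variables. The helper name `classify_varnames` is hypothetical.

```python
def classify_varnames(ref_dims, dev_dims):
    """ref_dims and dev_dims map variable names to tuples of dimension names."""
    common = sorted(set(ref_dims) & set(dev_dims))

    def has(name, *dims):
        return all(dim in ref_dims[name] for dim in dims)

    vars3d = [v for v in common if has(v, "lat", "lon", "lev")]
    vars2d = [v for v in common if has(v, "lat", "lon") and not has(v, "lev")]
    other = [v for v in common if v not in vars2d and v not in vars3d]
    return {
        "commonvars": common,
        "commonvarsOther": other,
        "commonvars2D": vars2d,
        "commonvars3D": vars3d,
        "refonly": sorted(set(ref_dims) - set(dev_dims)),
        "devonly": sorted(set(dev_dims) - set(ref_dims)),
    }


ref_dims = {
    "O3": ("time", "lev", "lat", "lon"),
    "AREA": ("lat", "lon"),
    "hyam": ("lev",),
    "OLD": ("lat", "lon"),
}
dev_dims = {
    "O3": ("time", "lev", "lat", "lon"),
    "AREA": ("lat", "lon"),
    "hyam": ("lev",),
    "NEW": ("lat", "lon"),
}
vardict = classify_varnames(ref_dims, dev_dims)
print(vardict["commonvars3D"], vardict["refonly"], vardict["devonly"])
```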

gcpy.compare_stats(refdata, refstr, devdata, devstr, varname)

Prints out global statistics (array sizes, mean, min, max, sum) from two xarray Dataset objects.

Args:

refdata: xarray Dataset: The first Dataset to be compared. (This is often referred to as the “Reference” Dataset.)
refstr: str: Label for refdata to be used in the printout
devdata: xarray Dataset: The second Dataset to be compared. (This is often referred to as the “Development” Dataset.)
devstr: str: Label for devdata to be used in the printout
varname: str: Variable name for which global statistics will be printed out.

gcpy.convert_bpch_names_to_netcdf_names(ds, verbose=False)

Function to convert the non-standard bpch diagnostic names to names used in the GEOS-Chem netCDF diagnostic outputs.

Args:

ds: xarray Dataset: The xarray Dataset object whose names are to be replaced.

Keyword Args (optional):

verbose: bool: Set this flag to True to print informational output. Default value: False

Returns:

ds_new: xarray Dataset: A new xarray Dataset object with all of the bpch-style diagnostic names replaced by GEOS-Chem netCDF names.

Remarks:

To add more diagnostic names, edit the dictionary contained in the bpch_to_nc_names.yml.

gcpy.get_lumped_species_definitions()

Returns lumped species definitions from a YAML file.

Returns:

lumped_spc_dict: dict of str: Dictionary of lumped species

gcpy.archive_lumped_species_definitions(dst)

Archives lumped species definitions to a YAML file.

Args:

dst: str: Name of the folder where the YAML file containing the lumped species definitions will be written.

gcpy.add_lumped_species_to_dataset(ds, lspc_dict={}, lspc_yaml='', verbose=False, overwrite=False, prefix='SpeciesConc_')

Function to calculate lumped species concentrations and add them to an xarray Dataset. Lumped species definitions may be passed as a dictionary or as a path to a YAML file. If neither is passed, then the lumped species YAML file stored in gcpy is used. This file is customized for use with benchmark simulation SpeciesConc diagnostic collection output.

Args:

ds: xarray Dataset: An xarray Dataset object prior to adding lumped species.

Keyword Args (optional):

lspc_dict: dictionary: Dictionary containing list of constituent species and their integer scale factors per lumped species. Default value: {}
lspc_yaml: str: Path to a YAML file containing lumped species definitions. Default value: empty string
verbose: bool: Whether to print informational output. Default value: False
overwrite: bool: Whether to overwrite an existing species dataarray in a dataset if it has the same name as a new lumped species. If False and overlapping names are found then the function will raise an error. Default value: False
prefix: str: Prefix to prepend to new lumped species names. This argument is also used to extract an existing dataarray in the dataset with the correct size and dimensions to use during initialization of new lumped species dataarrays. Default value: “SpeciesConc_”

Returns:

ds_new: xarray Dataset: A new xarray Dataset object containing all of the original species plus new lumped species.
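
A lumped species is a scaled sum of its constituent species. The sketch below illustrates that arithmetic with lists in place of DataArrays. It is not the GCPy implementation; the helper name `add_lumped_species`, the layout chosen for lspc_dict, and the NOx definition are assumptions for illustration only.

```python
def add_lumped_species(concs, lspc_dict, prefix="SpeciesConc_", overwrite=False):
    """concs maps variable names to equally long lists of concentrations.

    lspc_dict maps each lumped species name to a dictionary of
    {constituent species: integer scale factor} (assumed layout).
    """
    out = dict(concs)
    for lumped, parts in lspc_dict.items():
        new_name = prefix + lumped
        if new_name in out and not overwrite:
            raise ValueError(f"{new_name} already exists in the data")
        n_values = len(next(iter(concs.values())))
        total = [0.0] * n_values
        for species, scale in parts.items():
            values = concs[prefix + species]
            total = [t + scale * v for t, v in zip(total, values)]
        out[new_name] = total
    return out


# Hypothetical definition: NOx = NO + NO2
lspc_dict = {"NOx": {"NO": 1, "NO2": 1}}
concs = {"SpeciesConc_NO": [1.0, 2.0], "SpeciesConc_NO2": [0.5, 0.5]}

result = add_lumped_species(concs, lspc_dict)
print(result["SpeciesConc_NOx"])
```

Calling the helper again on `result` without overwrite=True raises an error, which mirrors the behavior described for overlapping names.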

gcpy.filter_names(names, text='')

Returns elements in a list that match a given substring. Can be used in conjunction with compare_varnames to return a subset of variable names pertaining to a given diagnostic type or species.

Args:

names: list of str: Input list of names.
text: str: Target text string for restricting the search.

Returns:

filtered_names: list of str: Returns all elements of names that contain the substring specified by the “text” argument. If “text” is omitted, then the original contents of names will be returned.
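
The substring filtering described above can be sketched in a few lines. This is a stand-in written for illustration rather than the gcpy source; the helper name `filter_by_substring` and the variable names in the list are hypothetical.

```python
def filter_by_substring(names, text=""):
    """Keep only the names that contain text; an empty text keeps everything."""
    return [name for name in names if text in name]


# Hypothetical variable names, e.g. as returned by a variable comparison
names = ["EmisCO_Anthro", "EmisNO_Soil", "SpeciesConc_CO"]
emis_names = filter_by_substring(names, "Emis")
```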

gcpy.divide_dataset_by_dataarray(ds, dr, varlist=None)

Divides variables in an xarray Dataset object by a single DataArray object. Will also make sure that the Dataset variable attributes are preserved. This method can be useful for certain types of model diagnostics that have to be divided by a counter array. For example, local noontime J-value variables in a Dataset can be divided by the fraction of time it was local noon in each grid box, etc.

Args:

ds: xarray Dataset: The Dataset object containing variables to be divided.
dr: xarray DataArray: The DataArray object that will be used to divide the variables of ds.

Keyword Args (optional):

varlist: list of str: If passed, then only those variables of ds that are listed in varlist will be divided by dr. Otherwise, all variables of ds will be divided by dr. Default value: None

Returns:

ds_new: xarray Dataset: A new xarray Dataset object with its variables divided by dr.

gcpy.get_shape_of_data(data, vertical_dim='lev', return_dims=False)

Convenience routine to return the shape (and dimensions, if requested) of an xarray Dataset or xarray DataArray. Can also take as input a dictionary of sizes (i.e. {‘time’: 1, ‘lev’: 72, …}) from an xarray Dataset or xarray DataArray object.

Args:

data: xarray Dataset, xarray DataArray, or dict: The data for which the size is requested.

Keyword Args (optional):

vertical_dim: str: Specify the vertical dimension that you wish to return: lev or ilev. Default value: ‘lev’
return_dims: bool: Set this switch to True if you also wish to return a list of dimensions in the same order as the tuple of dimension sizes. Default value: False

Returns:

shape: tuple of int: Tuple containing the sizes of each dimension of data in order: (time, lev|ilev, nf, lat|YDim, lon|XDim).
dims: list of str: If return_dims is True, then dims will contain a list of dimension names in the same order as shape ([‘time’, ‘lev’, ‘lat’, ‘lon’] for GEOS-Chem “Classic”, or [‘time’, ‘lev’, ‘nf’, ‘Ydim’, ‘Xdim’] for GCHP).
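
To make the ordering of the returned tuple concrete, here is a minimal sketch that works on a plain dictionary of sizes. It only illustrates the documented ordering; the helper `shape_from_sizes` is hypothetical and the sizes are example values.

```python
def shape_from_sizes(sizes, vertical_dim="lev", return_dims=False):
    """Order dimension sizes as (time, lev|ilev, nf, lat|Ydim, lon|Xdim)."""
    # Candidate names in the documented order; only those present are kept
    order = ["time", vertical_dim, "nf", "lat", "Ydim", "lon", "Xdim"]
    dims = [dim for dim in order if dim in sizes]
    shape = tuple(sizes[dim] for dim in dims)
    return (shape, dims) if return_dims else shape


# Lat/lon layout (GEOS-Chem "Classic" style) and cubed-sphere layout (GCHP style)
gcc_shape = shape_from_sizes({"lon": 72, "lat": 46, "lev": 72, "time": 1})
gchp_shape, gchp_dims = shape_from_sizes(
    {"time": 1, "lev": 72, "nf": 6, "Ydim": 24, "Xdim": 24}, return_dims=True
)
```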

gcpy.get_area_from_dataset(ds)

Convenience routine to return the area variable (which is usually called “AREA” for GEOS-Chem “Classic” or “Met_AREAM2” for GCHP) from an xarray Dataset object.

Args:

ds: xarray Dataset: The input dataset.

Returns:

area_m2: xarray DataArray: The surface area in m2, as found in ds.

gcpy.get_variables_from_dataset(ds, varlist)

Convenience routine to return multiple selected DataArray variables from an xarray Dataset. All variables must be found in the Dataset, or else an error will be raised.

Args:

ds: xarray Dataset: The input dataset.
varlist: list of str: List of DataArray variables to extract from ds.

Returns:

ds_subset: xarray Dataset: A new data set containing only the variables that were requested.

Remarks: Use this routine if you absolutely need all of the requested variables to be returned. Otherwise

gcpy.create_dataarray_of_nan(name, sizes, coords, attrs, vertical_dim='lev')

Given a name, dimension sizes, coordinates, and attributes (for example, those of an existing xarray DataArray dr), returns a DataArray object with those dimensions, coordinates, attributes, and name, but with its data set to missing values (NaN) everywhere. This is useful if you need to plot or compare two DataArray variables, and need to represent one as missing or undefined.

Args:

name: str: The name for the DataArray object that will contain NaNs.
sizes: dict of int: Dictionary of the dimension names and their sizes (e.g. {‘time’: 1, ‘lev’: 72, …}) that will be used to create the DataArray of NaNs. This can be obtained from an xarray Dataset as ds.sizes.
coords: dict of lists of float: Dictionary containing the coordinate variables that will be used to create the DataArray of NaNs. This can be obtained from an xarray Dataset with ds.coords.
attrs: dict of str: Dictionary containing the DataArray variable attributes (such as “units”, “long_name”, etc.). This can be obtained from an xarray DataArray with dr.attrs.

Returns:

dr: xarray DataArray: The output DataArray object, which will contain NaN values everywhere. This will denote missing data.

gcpy.check_for_area(ds, gcc_area_name='AREA', gchp_area_name='Met_AREAM2')

Makes sure that a dataset has a surface area variable contained within it. GEOS-Chem Classic files all contain surface area as variable AREA. GCHP files do not, so area must be retrieved from variable Met_AREAM2 of the met-field collection. To simplify comparisons, the GCHP area variable will be appended to the dataset under the GEOS-Chem “Classic” area name if it is present.

Args:

ds: xarray Dataset: The Dataset object that will be checked.

Keyword Args (optional):

gcc_area_name: str: Specifies the name of the GEOS-Chem “Classic” surface area variable. Default value: “AREA”
gchp_area_name: str: Specifies the name of the GCHP surface area variable. Default value: “Met_AREAM2”

Returns:

ds: xarray Dataset: The modified Dataset object

gcpy.get_filepath(datadir, col, date, is_gchp=False, gchp_format_is_legacy=False)

Routine to return file path for a given GEOS-Chem “Classic” (aka “GCC”) or GCHP diagnostic collection and date.

Args:

datadir: str: Path name of the directory containing GCC or GCHP data files.
col: str: Name of collection (e.g. Emissions, SpeciesConc, etc.) for which file path will be returned.
date: numpy.datetime64: Date for which file paths are requested.

Keyword Args (optional):

is_gchp: bool: Set this switch to True to obtain file pathnames to GCHP diagnostic data files. If False, assumes GEOS-Chem “Classic”.
gchp_format_is_legacy: bool: Set this switch to True to obtain GCHP file pathnames of the legacy format for diagnostics, which do not match GC-Classic filenames. Set to False to use same format as GC-Classic.

Returns:

path: str: Pathname for the specified collection and date.

gcpy.get_filepaths(datadir, collections, dates, is_gchp=False, gchp_format_is_legacy=False)

Routine to return filepaths for a given GEOS-Chem “Classic” (aka “GCC”) or GCHP diagnostic collection.

Args:

datadir: str: Path name of the directory containing GCC or GCHP data files.
collections: list of str: Names of collections (e.g. Emissions, SpeciesConc, etc.) for which file paths will be returned.
dates: array of numpy.datetime64: Array of dates for which file paths are requested.

Keyword Args (optional):

is_gchp: bool: Set this switch to True to obtain file pathnames to GCHP diagnostic data files. If False, assumes GEOS-Chem “Classic”.
gchp_format_is_legacy: bool: Set this switch to True to obtain GCHP file pathnames of the legacy format for diagnostics, which do not match GC-Classic filenames. Set to False to use same format as GC-Classic.

Returns:

paths: 2D list of str: A list of pathnames for each specified collection and date. First dimension is collection, and second is date.

gcpy.extract_pathnames_from_log(filename, prefix_filter='')

Returns a list of pathnames from a GEOS-Chem log file. This can be used to get a list of files that should be downloaded from gcgrid or from Amazon S3.

Args:

filename: str: GEOS-Chem standard log file
prefix_filter (optional): str: Restricts the output to file paths starting with this prefix (e.g. “/home/ubuntu/ExtData/HEMCO/”) Default value: ‘’

Returns:

data_list: list of str: List of full pathnames of data files found in the log file.

Author:

Jiawei Zhuang (jiaweizhuang@g.harvard.edu)

gcpy.get_gcc_filepath(outputdir, collection, day, time)

Routine for getting filepath of GEOS-Chem Classic output

Args:

outputdir: str: Path of the OutputDir directory
collection: str: Name of output collection, e.g. Emissions or SpeciesConc
day: str: Number day of output, e.g. 31
time: str: Z time of output, e.g. 1200z

Returns:

filepath: str: Path of requested file

gcpy.get_gchp_filepath(outputdir, collection, day, time)

Routine for getting filepath of GCHP output

Args:

outputdir: str: Path of the OutputDir directory
collection: str: Name of output collection, e.g. Emissions or SpeciesConc
day: str: Number day of output, e.g. 31
time: str: Z time of output, e.g. 1200z

Returns:

filepath: str: Path of requested file

gcpy.get_nan_mask(data)

Create a mask with NaN values removed from an input array

Args:

data: numpy array: Input array possibly containing NaNs

Returns:

new_data: numpy array: Original array with NaN values removed

gcpy.all_zero_or_nan(ds)

Returns whether ds is all zeros or all NaNs.

Args:

ds: numpy array: Input GEOS-Chem data

Returns:

all_zero, all_nan: bool, bool: all_zero is whether ds is all zeros; all_nan is whether ds is all NaNs.
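
The two flags can be illustrated with a small standard-library sketch. A flat Python list stands in for the numpy array here, and the helper name `all_zero_or_nan_flat` is hypothetical; the gcpy routine itself operates on numpy data.

```python
import math


def all_zero_or_nan_flat(values):
    """Return (all_zero, all_nan) for a flat sequence of floats."""
    # NaN never compares equal to 0, so a NaN entry makes all_zero False
    all_zero = all(value == 0 for value in values)
    all_nan = all(math.isnan(value) for value in values)
    return all_zero, all_nan


zeros_flags = all_zero_or_nan_flat([0.0, 0.0, 0.0])
nans_flags = all_zero_or_nan_flat([float("nan"), float("nan")])
mixed_flags = all_zero_or_nan_flat([0.0, 1.5, float("nan")])
```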

gcpy.dataset_mean(ds, dim='time', skipna=True)

Convenience wrapper for taking the mean of an xarray Dataset.

Args:

ds: xarray Dataset: Input data

Keyword Args:

dim: str: Dimension over which the mean will be taken. Default: “time”
skipna: bool: Flag to omit missing values from the mean. Default: True

Returns:

ds_mean: xarray Dataset or None: Dataset containing mean values. Will return None if ds is not defined.

gcpy.dataset_reader(multi_files)

Returns a function to read an xarray Dataset.

Args:

multi_files: bool: Denotes whether we will be reading multiple files into an xarray Dataset. Default value: False

Returns:

reader : either xr.open_mfdataset or xr.open_dataset

gcpy.read_config_file(config_file)

Reads configuration information from a YAML file.

gcpy.get_timestamp_string(date_array)

Convenience function returning the datetime timestamp based on the given input

Args:

date_array: array: Array of integers corresponding to [year, month, day, hour, minute, second]. Any integers not provided will be padded accordingly.

Returns:

date_str: string: String in datetime format (e.g. 2019-01-01T00:00:00Z)
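
A minimal sketch of the padding and formatting step follows. It is an illustration, not the gcpy source: the helper name `timestamp_string` is hypothetical, and the padding rule (missing month and day become 1, missing hour, minute, and second become 0) is an assumption that is consistent with the example timestamp above.

```python
def timestamp_string(date_array):
    """Format [year, month, day, hour, minute, second] as an ISO-like string."""
    # Assumed padding: month and day default to 1; hour, minute, second to 0
    defaults = [None, 1, 1, 0, 0, 0]
    if not 1 <= len(date_array) <= 6:
        raise ValueError("date_array must contain between 1 and 6 integers")
    padded = list(date_array) + defaults[len(date_array):]
    year, month, day, hour, minute, second = padded
    return (
        f"{year:04d}-{month:02d}-{day:02d}"
        f"T{hour:02d}:{minute:02d}:{second:02d}Z"
    )


year_only = timestamp_string([2019])
with_hour = timestamp_string([2019, 7, 15, 12])
```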

gcpy.add_months(start_date, n_months)

Args:

start_date: numpy.datetime64: numpy datetime64 object
n_months: int: Number of months to add to start_date

Returns:

new_date: numpy.datetime64: numpy datetime64 object with exactly n_months added to the date
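
Calendar-month addition can be sketched with the standard library as below. This is an analogue for illustration only: it uses `datetime.date` instead of numpy.datetime64, the helper name `add_months_to_date` is hypothetical, and clamping the day to the end of a shorter month is an assumption, not documented gcpy behavior.

```python
import calendar
from datetime import date


def add_months_to_date(start, n_months):
    """Return start shifted by n_months calendar months."""
    # A zero-based month count turns the year rollover into a simple divmod
    total = start.year * 12 + (start.month - 1) + n_months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Assumption: clamp the day to the last day of the target month
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


across_year = add_months_to_date(date(2019, 11, 1), 3)
clamped = add_months_to_date(date(2019, 1, 31), 1)
```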

gcpy.is_full_year(start_date, end_date)

Verifies if two dates are a full year starting Jan 1.

Args:

start_date: numpy.datetime64: numpy datetime64 object
end_date: numpy.datetime64: numpy datetime64 object

Returns: boolean

gcpy.adjust_units(units)

Creates a consistent unit string that will be used in the unit conversion routines below.

Args:

units: str: Input unit string.

Returns:

adjusted_units: str: Output unit string, adjusted to a consistent value.

Remarks:

Unit list is incomplete – currently is geared to units from common model diagnostics (e.g. kg/m2/s, kg, and variants).

gcpy.convert_kg_to_target_units(data_kg, target_units, kg_to_kgC)

Converts a data array from kg to one of several types of target units.

Args:

data_kg: numpy ndarray: Input data array, in units of kg.
target_units: str: String containing the name of the units to which the “data_kg” argument will be converted. Examples: ‘Tg’, ‘Tg C’, ‘Mg’, ‘Mg C’, ‘kg’, ‘kg C’, etc.
kg_to_kgC: float: Conversion factor from kg to kg carbon.

Returns:

data: numpy ndarray: Output data array, converted to the units specified by the ‘target_units’ argument.

Remarks:

At present, only those unit conversions corresponding to the GEOS-Chem benchmarks have been implemented.

This is an internal routine, which is meant to be called directly from convert_units.
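
The arithmetic behind such a conversion is a division by the number of kilograms per target unit, optionally followed by the kg-to-kg-carbon factor. The sketch below is illustrative only: the helper `kg_to_target`, the lookup table, and the example conversion factor are hypothetical, and a plain list stands in for the numpy array.

```python
# Kilograms per unit for a few SI mass prefixes
KG_PER_UNIT = {"kg": 1.0, "Mg": 1.0e3, "Gg": 1.0e6, "Tg": 1.0e9}


def kg_to_target(data_kg, target_units, kg_to_kgC=1.0):
    """Convert masses in kg to e.g. 'Tg' or 'Tg C' (illustrative sketch)."""
    # A trailing " C" requests the mass of carbon, e.g. "Tg C"
    base, _, suffix = target_units.partition(" ")
    if base not in KG_PER_UNIT or suffix not in ("", "C"):
        raise ValueError(f"Unsupported target units: {target_units}")
    carbon_factor = kg_to_kgC if suffix == "C" else 1.0
    return [value / KG_PER_UNIT[base] * carbon_factor for value in data_kg]


total_tg = kg_to_target([2.0e9, 5.0e8], "Tg")
# 0.5 is a made-up kg-to-kg-carbon factor used purely for illustration
total_tg_c = kg_to_target([2.0e9], "Tg C", kg_to_kgC=0.5)
```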

gcpy.convert_units(dr, species_name, species_properties, target_units, interval=[2678400.0], area_m2=None, delta_p=None, box_height=None)

Converts data stored in an xarray DataArray object from its native units to a target unit.

Args:

dr: xarray DataArray: Data to be converted from native units to target units.
species_name: str: Name of the species corresponding to the data stored in “dr”.
species_properties: dict: Dictionary containing species properties (e.g. molecular weights and other metadata) for the given species.
target_units: str: Units to which the data will be converted.

Keyword Args (optional):

interval: list of float: The length of the averaging period in seconds. Default value: [2678400.0] (the number of seconds in a 31-day month)
area_m2: xarray DataArray: Surface area in square meters. Default value: None
delta_p: xarray DataArray: Delta-pressure between top and bottom edges of grid box (dry air) in hPa. Default value: None
box_height: xarray DataArray: Grid box height in meters. Default value: None

Returns:

dr_new: xarray DataArray: Data converted to target units.

Remarks:

At present, only certain types of unit conversions have been implemented (corresponding to the most commonly used unit conversions for model benchmark output).

When mol mol-1 is present as the unit, dry air is assumed.

gcpy.check_units(ref_da, dev_da, enforce_units=True)

Ensures the units of two xarray DataArrays are the same.

Args:

ref_da: xarray DataArray: First data array containing a units attribute.
dev_da: xarray DataArray: Second data array containing a units attribute.

Keyword Args (optional):

enforce_units: bool: Whether to stop the program if ref and dev units do not match. Default value: True

Returns:

units_match: bool

gcpy.data_unit_is_mol_per_mol(da)

Checks whether the units of an xarray DataArray are mol/mol, by comparing them against a fixed list of unit strings that denote mol/mol.

Args:

da: xarray DataArray: Data array containing a units attribute

Returns:

is_molmol: bool: Whether input units are mol/mol
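
A membership test against a fixed list of spellings is one simple way to implement such a check. In the sketch below the list of accepted strings and the helper name `unit_is_mol_per_mol` are assumptions made for illustration; the strings gcpy actually recognizes may differ.

```python
# Assumed spellings of mol/mol; not taken from the gcpy source
MOL_PER_MOL_STRINGS = ("mol mol-1", "mol mol-1 dry", "mol/mol", "mol/mol dry")


def unit_is_mol_per_mol(units):
    """Return True if the unit string is one of the assumed mol/mol spellings."""
    return units.strip() in MOL_PER_MOL_STRINGS


is_mixing_ratio = unit_is_mol_per_mol("mol mol-1 dry")
is_mass = unit_is_mol_per_mol("kg")
```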

class gcpy._GlobVars(devstr, files, dst, year, bmk_type, species, overwrite, month)

Private class _GlobVars contains global data that needs to be shared among the methods in this module.

gcpy.compute_ste(globvars)

Computes the strat-trop-exchange, taken as species flux across the 100hPa pressure level.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.

Returns:

result: Pandas DataFrame: Strat-trop fluxes

gcpy.print_ste(globvars, df)

Prints the strat-trop exchange table.

Args:

globvars: _GlobVars: Global variables
df: pandas DataFrame: Strat-trop exchange table

gcpy.make_benchmark_ste_table(devstr, files, year, dst='./1yr_benchmark', bmk_type='FullChemBenchmark', species=['O3'], overwrite=True, month=None)

Driver program. Computes and prints strat-trop exchange for the selected species and benchmark year.

Args:

devstr: str: Label denoting the “Dev” version.
files: list of str: List of files containing vertical fluxes.
year: str: Year of the benchmark simulation.

Keyword Args (optional):

dst: str: Directory where plots & tables will be created.
bmk_type: str: FullChemBenchmark or TransportTracersBenchmark.
species: list of str: Species for which STE fluxes are desired.
overwrite: bool: Denotes whether to overwrite existing budget tables.
month: float: If passed, specifies the month of a 1-month benchmark. Default: None (denotes a 1-year benchmark)

gcpy.make_grid_LL(llres, in_extent=[-180, 180, -90, 90], out_extent=[])

Creates a lat/lon grid description.

Args:

llres: str: lat/lon resolution in ‘latxlon’ format (e.g. ‘4x5’)

Keyword Args (optional):

in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of initial grid in the format [minlon, maxlon, minlat, maxlat]. Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of target grid in the format [minlon, maxlon, minlat, maxlat]. Needed when intending to use grid to trim extent of input data. Default value: [] (assumes value of in_extent)

Returns:

llgrid: dict: Grid description of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}
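
The layout of the returned dictionary can be illustrated with a plain uniform grid. This sketch is not the gcpy algorithm: the helper `make_uniform_ll_grid` is hypothetical, and it deliberately ignores any GEOS-Chem-specific conventions (such as special treatment of polar cells or a shifted longitude origin), so its cell counts and centers may differ from what make_grid_LL returns.

```python
def make_uniform_ll_grid(llres, extent=(-180.0, 180.0, -90.0, 90.0)):
    """Build midpoints and edges for a uniform lat/lon grid (sketch only)."""
    # Resolution string is 'latxlon', e.g. '4x5' means 4 deg lat by 5 deg lon
    dlat, dlon = (float(value) for value in llres.split("x"))
    minlon, maxlon, minlat, maxlat = extent
    nlon = round((maxlon - minlon) / dlon)
    nlat = round((maxlat - minlat) / dlat)
    lon_b = [minlon + i * dlon for i in range(nlon + 1)]
    lat_b = [minlat + j * dlat for j in range(nlat + 1)]
    # Midpoints are the averages of adjacent edges
    lon = [0.5 * (lon_b[i] + lon_b[i + 1]) for i in range(nlon)]
    lat = [0.5 * (lat_b[j] + lat_b[j + 1]) for j in range(nlat)]
    return {"lat": lat, "lon": lon, "lat_b": lat_b, "lon_b": lon_b}


grid = make_uniform_ll_grid("4x5")
```

Each edge list has one more element than the corresponding midpoint list, which is the relationship the ‘lat_b’ and ‘lon_b’ keys are meant to capture.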

gcpy.make_grid_CS(csres)

Creates a cubed-sphere grid description.

Args:

csres: int: cubed-sphere resolution of target grid

Returns:

[csgrid, csgrid_list]: list[dict, list[dict]]: csgrid is a dict of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}, where each value has an extra face dimension of length 6. csgrid_list is a list of dicts separated by face index.

gcpy.make_grid_SG(csres, stretch_factor, target_lon, target_lat)

Creates a stretched-grid grid description.

Args:

csres: int: cubed-sphere resolution of target grid
stretch_factor: float: stretch factor of target grid
target_lon: float: target stretching longitude of target grid
target_lat: float: target stretching latitude of target grid

Returns:

[csgrid, csgrid_list]: list[dict, list[dict]]: csgrid is a dict of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}, where each value has an extra face dimension of length 6. csgrid_list is a list of dicts separated by face index.

gcpy.get_input_res(data)

Returns resolution of dataset passed to compare_single_level or compare_zonal_mean

Args:

data: xarray Dataset: Input GEOS-Chem dataset

Returns:

res: str or int: Lat/lon res of the form ‘latresxlonres’ or cubed-sphere resolution
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere

gcpy.call_make_grid(res, gridtype, in_extent=[-180, 180, -90, 90], out_extent=[-180, 180, -90, 90], sg_params=[1, 170, -90])

Creates a lat/lon, cubed-sphere, or stretched-grid grid description for the given resolution and grid type.

Args:

res: str or int: Resolution of grid (format ‘latxlon’ or csres)
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere

Keyword Args (optional):

in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of input data in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Desired minimum and maximum latitude and longitude of output grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
sg_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Desired stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Will trigger stretched-grid creation if not default values. Default value: [1, 170, -90] (no stretching)

Returns:

[grid, grid_list]: list(dict, list(dict)): Returns the created grid. grid_list is a list of grids if gridtype is ‘cs’, else it is None

gcpy.get_grid_extents(data, edges=True)

Get min and max lat and lon from an input GEOS-Chem xarray dataset or grid dict

Args:

data: xarray Dataset or dict: A GEOS-Chem dataset or a grid dict
edges (optional): bool: Whether grid extents should use cell edges instead of centers. Default value: True

Returns:

minlon: float: Minimum longitude of data grid
maxlon: float: Maximum longitude of data grid
minlat: float: Minimum latitude of data grid
maxlat: float: Maximum latitude of data grid

gcpy.get_vert_grid(dataset, AP=[], BP=[])

Determine vertical grid of input dataset

Args:

dataset: xarray Dataset: A GEOS-Chem output dataset

Keyword Args (optional):

AP: list-like type: Hybrid grid parameter A in hPa. Default value: []
BP: list-like type: Hybrid grid parameter B (unitless). Default value: []

Returns:

p_edge: numpy array: Edge pressure values for vertical grid
p_mid: numpy array: Midpoint pressure values for vertical grid
nlev: int: Number of levels in vertical grid
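
On a hybrid sigma-pressure grid, the pressure at each level edge is commonly computed as A + B × surface pressure. The sketch below applies that relation to made-up coefficients; the helper `hybrid_pressures`, the surface pressure value, and the choice of an arithmetic mean for the midpoints are assumptions for illustration and are not taken from the gcpy source.

```python
def hybrid_pressures(AP, BP, p_surface=1013.25):
    """Return (p_edge, p_mid, nlev) for hybrid coefficients AP [hPa] and BP."""
    if len(AP) != len(BP):
        raise ValueError("AP and BP must have the same length")
    # Edge pressure: p = A + B * surface pressure
    p_edge = [a + b * p_surface for a, b in zip(AP, BP)]
    # Assumed midpoint definition: arithmetic mean of adjacent edges
    p_mid = [0.5 * (p_edge[k] + p_edge[k + 1]) for k in range(len(p_edge) - 1)]
    return p_edge, p_mid, len(p_mid)


# Made-up coefficients for a 3-level grid (4 edges), surface first
AP = [0.0, 50.0, 100.0, 0.0]
BP = [1.0, 0.5, 0.0, 0.0]
p_edge, p_mid, nlev = hybrid_pressures(AP, BP, p_surface=1000.0)
```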

gcpy.make_regridder_L2L(llres_in, llres_out, weightsdir='.', reuse_weights=False, in_extent=[-180, 180, -90, 90], out_extent=[-180, 180, -90, 90])

Create an xESMF regridder between two lat/lon grids

Args:

llres_in: str: Resolution of input grid in format ‘latxlon’, e.g. ‘4x5’
llres_out: str: Resolution of output grid in format ‘latxlon’, e.g. ‘4x5’

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: False
in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of input grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Desired minimum and maximum latitude and longitude of output grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]

Returns:

regridder: xESMF regridder: regridder object between the two specified grids

gcpy.make_regridder_C2L(csres_in, llres_out, weightsdir='.', reuse_weights=True, sg_params=[1, 170, -90])

Create an xESMF regridder from a cubed-sphere to lat/lon grid

Args:

csres_in: int: Cubed-sphere resolution of input grid
llres_out: str: Resolution of output grid in format ‘latxlon’, e.g. ‘4x5’

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
sg_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Input grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Will trigger stretched-grid creation if not default values. Default value: [1, 170, -90] (no stretching)

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids

gcpy.make_regridder_S2S(csres_in, csres_out, sf_in=1, tlon_in=170, tlat_in=-90, sf_out=1, tlon_out=170, tlat_out=-90, weightsdir='.', verbose=True)

Create an xESMF regridder from a cubed-sphere / stretched-grid grid to another cubed-sphere / stretched-grid grid. Stretched-grid params of 1, 170, -90 indicate no stretching.

Args:

csres_in: int: Cubed-sphere resolution of input grid
csres_out: int: Cubed-sphere resolution of output grid

Keyword Args (optional):

sf_in: float: Stretched-grid factor of input grid. Default value: 1
tlon_in: float: Target longitude for stretching in input grid. Default value: 170
tlat_in: float: Target latitude for stretching in input grid. Default value: -90
sf_out: float: Stretched-grid factor of output grid. Default value: 1
tlon_out: float: Target longitude for stretching in output grid. Default value: 170
tlat_out: float: Target latitude for stretching in output grid. Default value: -90
weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
verbose: bool: Set this flag to True to enable printing when output faces do not intersect input faces when regridding. Default value: True

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids

gcpy.make_regridder_L2S(llres_in, csres_out, weightsdir='.', reuse_weights=True, sg_params=[1, 170, -90])

Create an xESMF regridder from a lat/lon to a cubed-sphere grid

Args:

llres_in: str: Resolution of input grid in format ‘latxlon’, e.g. ‘4x5’
csres_out: int: Cubed-sphere resolution of output grid

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
sg_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Output grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Will trigger stretched-grid creation if not default values. Default value: [1, 170, -90] (no stretching)

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids

gcpy.create_regridders(refds, devds, weightsdir='.', reuse_weights=True, cmpres=None, zm=False, sg_ref_params=[1, 170, -90], sg_dev_params=[1, 170, -90])

Internal function used for creating regridders between two datasets. Follows decision logic needed for plotting functions. Originally code from compare_single_level and compare_zonal_mean.

Args:

refds: xarray Dataset: Input dataset
devds: xarray Dataset: Output dataset

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
cmpres: int or str: Specific target resolution for comparison grid used in difference and ratio plots. Default value: None (will follow logic chain below)
zm: bool: Set this flag to True if regridders will be used in zonal mean plotting. Default value: False
sg_ref_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Ref grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)
sg_dev_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Dev grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)

Returns:

list of many different quantities needed for regridding in plotting functions

refres, devres, cmpres: str or int: Resolution of a dataset grid
refgridtype, devgridtype, cmpgridtype: str: Gridtype of a dataset (‘ll’ or ‘cs’)
regridref, regriddev, regridany: bool: Whether to regrid a dataset
refgrid, devgrid, cmpgrid: dict: Grid definition of a dataset
refregridder, devregridder: xESMF regridder: Regridder object between refgrid or devgrid and cmpgrid (will be None if input grid is not lat/lon)
refregridder_list, devregridder_list: list[6 xESMF regridders]: List of regridder objects for each face between refgrid or devgrid and cmpgrid (will be None if input grid is not cubed-sphere)

gcpy.regrid_comparison_data(data, res, regrid, regridder, regridder_list, global_cmp_grid, gridtype, cmpgridtype, cmpminlat_ind=0, cmpmaxlat_ind=-2, cmpminlon_ind=0, cmpmaxlon_ind=-2, nlev=1)

Regrid comparison datasets to cubed-sphere (including stretched-grid) or lat/lon format.

Args:

data: xarray DataArray: DataArray containing a GEOS-Chem data variable
res: int: Cubed-sphere resolution for comparison grid
regrid: bool: Set to true to regrid dataset
regridder: xESMF regridder: Regridder between the original data grid and the comparison grid
regridder_list: list(xESMF regridder): List of regridders for cubed-sphere data
global_cmp_grid: xarray DataArray: Comparison grid
gridtype: str: Type of input data grid (either ‘ll’ or ‘cs’)
cmpgridtype: str: Type of comparison grid (either ‘ll’ or ‘cs’)

Keyword Args (optional):

cmpminlat_ind: int: Index of minimum latitude extent for comparison grid. Default value: 0
cmpmaxlat_ind: int: Index (minus 1) of maximum latitude extent for comparison grid. Default value: -2
cmpminlon_ind: int: Index of minimum longitude extent for comparison grid. Default value: 0
cmpmaxlon_ind: int: Index (minus 1) of maximum longitude extent for comparison grid. Default value: -2
nlev: int: Number of levels of input grid and comparison grid. Default value: 1

Returns:

data: xarray DataArray: Original DataArray regridded to comparison grid (including resolution and extent changes)

gcpy.reformat_dims(ds, format, towards_common)

Reformat dimensions of a cubed-sphere / stretched-grid grid between different GCHP formats

Args:

ds: xarray Dataset: Dataset to be reformatted
format: str: Format from or to which to reformat (‘checkpoint’ or ‘diagnostic’)
towards_common: bool: Set this flag to True to move towards a common dimension format

Returns:

ds: xarray Dataset: Original dataset with reformatted dimensions

gcpy.sg_hash(cs_res, stretch_factor: float, target_lat: float, target_lon: float)

gcpy.regrid_vertical_datasets(ref, dev, target_grid_choice='ref', ref_vert_params=[[], []], dev_vert_params=[[], []], target_vert_params=[[], []])

Perform complete vertical regridding of GEOS-Chem datasets to the vertical grid of one of the datasets or an entirely different vertical grid.

Args:

ref: xarray.Dataset: First dataset
dev: xarray.Dataset: Second dataset
target_grid_choice (optional): str: Will regrid to the chosen dataset among the two datasets unless target_vert_params is provided. Default value: ‘ref’
ref_vert_params (optional): list(list, list) of list-like types: Hybrid grid parameter A in hPa and B (unitless) in [AP, BP] format. Needed if ref grid is not 47 or 72 levels. Default value: [[], []]
dev_vert_params (optional): list(list, list) of list-like types: Hybrid grid parameter A in hPa and B (unitless) in [AP, BP] format. Needed if dev grid is not 47 or 72 levels. Default value: [[], []]
target_vert_params (optional): list(list, list) of list-like types: Hybrid grid parameter A in hPa and B (unitless) in [AP, BP] format. Will override target_grid_choice as target grid. Default value: [[], []]

Returns:

new_ref: xarray.Dataset: First dataset, possibly regridded to a new vertical grid
new_dev: xarray.Dataset: Second dataset, possibly regridded to a new vertical grid

gcpy.regrid_vertical(src_data_3D, xmat_regrid, target_levs=[])

Performs vertical regridding using a sparse regridding matrix. This function was originally written by Sebastian Eastham and included in package gcgridobj: https://github.com/sdeastham/gcgridobj

Args:

src_data_3D: xarray DataArray or numpy array: Data to be regridded
xmat_regrid: sparse scipy coordinate matrix: Regridding matrix from input data grid to target grid
target_levs (optional): list: Values for Z coordinate of returned data (if returned data is of type xr.DataArray). Default value: []

Returns:

out_data: xarray DataArray or numpy array: Data regridded to target grid

gcpy.gen_xmat(p_edge_from, p_edge_to)

Generates regridding matrix from one vertical grid to another. This function was originally written by Sebastian Eastham and included in package gcgridobj: https://github.com/sdeastham/gcgridobj

Args:

p_edge_from: numpy array: Edge pressures of the input grid
p_edge_to: numpy array: Edge pressures of the target grid

Returns:

xmat: sparse scipy coordinate matrix: Regridding matrix from input grid to target grid
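
One common way to build a vertical regridding matrix is to weight each source layer by the fraction of the target layer's pressure thickness that it overlaps. The sketch below follows that idea with a dense nested list in place of a sparse scipy matrix. It is an assumption about the general technique, not a transcription of gen_xmat, and the helper names and pressure values are hypothetical.

```python
def overlap_weights(p_edge_from, p_edge_to):
    """weights[j][i] = share of target layer j covered by source layer i."""
    n_from = len(p_edge_from) - 1
    n_to = len(p_edge_to) - 1
    weights = [[0.0] * n_from for _ in range(n_to)]
    for j in range(n_to):
        to_bot, to_top = p_edge_to[j], p_edge_to[j + 1]
        for i in range(n_from):
            from_bot, from_top = p_edge_from[i], p_edge_from[i + 1]
            # Pressure thickness shared by source layer i and target layer j
            overlap = min(to_bot, from_bot) - max(to_top, from_top)
            if overlap > 0.0:
                weights[j][i] = overlap / (to_bot - to_top)
    return weights


def apply_weights(weights, column):
    """Regrid one vertical column with the weight matrix."""
    return [sum(w * v for w, v in zip(row, column)) for row in weights]


# Edge pressures in hPa, surface first (pressure decreases with index)
p_from = [1000.0, 750.0, 500.0, 250.0, 0.0]
p_to = [1000.0, 500.0, 0.0]
xmat_dense = overlap_weights(p_from, p_to)
coarse_column = apply_weights(xmat_dense, [1.0, 3.0, 5.0, 7.0])
```

Because most entries of such a matrix are zero when the grids have many levels, storing it in a sparse format is a natural choice.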

gcpy.get_pressure_indices(pedge, pres_range)

Get indices where edge pressure values are within a given pressure range

Args:

pedge: numpy array: Edge pressure values
pres_range: list(float, float): Contains minimum and maximum pressure

Returns:

numpy array: Indices where edge pressure values are within a given pressure range
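
Selecting the indices inside a pressure range amounts to a bounds check on each edge value. The sketch below uses a plain list in place of a numpy array; the helper name `pressure_indices` is hypothetical, and treating both bounds as inclusive is an assumption.

```python
def pressure_indices(pedge, pres_range):
    """Return indices of edge pressures inside [min, max] of pres_range."""
    pmin, pmax = min(pres_range), max(pres_range)
    # Assumption: both bounds are inclusive
    return [i for i, p in enumerate(pedge) if pmin <= p <= pmax]


# Hypothetical edge pressures in hPa, surface first
pedge = [1000.0, 850.0, 500.0, 200.0, 50.0]
selected = pressure_indices(pedge, [100.0, 900.0])
```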

gcpy.pad_pressure_edges(pedge_ind, max_ind, pmid_len)

Add outer indices to edge pressure index list

Args:

pedge_ind: list: List of edge pressure indices
max_ind: int: Maximum index
pmid_len: int: Length of pmid which should not be exceeded by indices

Returns:

pedge_ind: list: List of edge pressure indices, possibly with new minimum and maximum indices

gcpy.convert_lev_to_pres(dataset, pmid, pedge, lev_type='pmid')

Convert lev dimension to pressure in a GEOS-Chem dataset

Args:

dataset: xarray Dataset: GEOS-Chem dataset
pmid: np.array: Midpoint pressure values
pedge: np.array: Edge pressure values
lev_type (optional): str: Denotes whether lev is ‘pedge’ or ‘pmid’ if grid is not 72/73 or 47/48 levels. Default value: ‘pmid’

Returns:

dataset: xarray Dataset: Input dataset with “lev” dimension values replaced with pressure values

gcpy.get_grid_extents(data, edges=True)

Get min and max lat and lon from an input GEOS-Chem xarray dataset or grid dict

Args:

data: xarray Dataset or dict: A GEOS-Chem dataset or a grid dict
edges (optional): bool: Whether grid extents should use cell edges instead of centers Default value: True

Returns:

minlon: float: Minimum longitude of data grid
maxlon: float: Maximum longitude of data grid
minlat: float: Minimum latitude of data grid
maxlat: float: Maximum latitude of data grid
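
The effect of the edges flag can be sketched with a plain grid dict. The key names "lon_b"/"lat_b" (cell edges) and "lon"/"lat" (cell centers), and the helper name grid_extents, are assumptions made for this example.

```python
def grid_extents(grid, edges=True):
    """Return (minlon, maxlon, minlat, maxlat) from a grid dict."""
    # Assumed key names: "lon_b"/"lat_b" hold edges, "lon"/"lat" hold centers
    lon_key, lat_key = ("lon_b", "lat_b") if edges else ("lon", "lat")
    lons, lats = grid[lon_key], grid[lat_key]
    return min(lons), max(lons), min(lats), max(lats)


# Hypothetical coarse grid with 90-degree cells
grid = {
    "lon_b": [-180.0, -90.0, 0.0, 90.0, 180.0],
    "lat_b": [-90.0, 0.0, 90.0],
    "lon": [-135.0, -45.0, 45.0, 135.0],
    "lat": [-45.0, 45.0],
}
edge_extent = grid_extents(grid)
center_extent = grid_extents(grid, edges=False)
```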

gcpy.call_make_grid(res, gridtype, in_extent=[-180, 180, -90, 90], out_extent=[-180, 180, -90, 90], sg_params=[1, 170, -90])

Create a lat/lon or cubed-sphere grid of the requested resolution and extent

Args:

res: str or int: Resolution of grid (format ‘latxlon’ or csres)
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere

Keyword Args (optional):

in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of input data in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Desired minimum and maximum latitude and longitude of output grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
sg_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Desired stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Will trigger stretched-grid creation if not default values. Default value: [1, 170, -90] (no stretching)

Returns:

[grid, grid_list]: list(dict, list(dict)): Returns the created grid. grid_list is a list of grids if gridtype is ‘cs’, else it is None

gcpy.get_input_res(data)

Returns resolution of dataset passed to compare_single_level or compare_zonal_means

Args:

data: xarray Dataset: Input GEOS-Chem dataset

Returns:

res: str or int: Lat/lon res of the form ‘latresxlonres’ or cubed-sphere resolution
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere
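
For a lat/lon dataset, a resolution string of the form ‘latresxlonres’ could be derived from the coordinate spacing roughly as follows. The helper name latlon_res_string is hypothetical, and measuring the latitude spacing away from the first cell (to avoid half-size polar cells) is an assumption of this sketch.

```python
def latlon_res_string(lats, lons):
    """Build a 'latresxlonres' string from cell-center coordinates."""
    # Skip the first latitude: polar cells may be narrower than the rest
    lat_res = abs(lats[2] - lats[1])
    lon_res = abs(lons[1] - lons[0])
    return f"{lat_res}x{lon_res}"


# Hypothetical cell centers resembling a 4x5 degree grid
lats = [-89.0, -86.0, -82.0, -78.0]
lons = [-180.0, -175.0, -170.0]
res = latlon_res_string(lats, lons)
print(res)
```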

gcpy.regrid_comparison_data(data, res, regrid, regridder, regridder_list, global_cmp_grid, gridtype, cmpgridtype, cmpminlat_ind=0, cmpmaxlat_ind=-2, cmpminlon_ind=0, cmpmaxlon_ind=-2, nlev=1)

Regrid comparison datasets to cubed-sphere (including stretched-grid) or lat/lon format.

Args:

data: xarray DataArray: DataArray containing a GEOS-Chem data variable
res: int: Cubed-sphere resolution for comparison grid
regrid: bool: Set to true to regrid dataset
regridder: xESMF regridder: Regridder between the original data grid and the comparison grid
regridder_list: list(xESMF regridder): List of regridders for cubed-sphere data
global_cmp_grid: xarray DataArray: Comparison grid
gridtype: str: Type of input data grid (either ‘ll’ or ‘cs’)
cmpgridtype: str: Type of comparison grid (either ‘ll’ or ‘cs’)

Keyword Args (optional):

cmpminlat_ind: int: Index of minimum latitude extent for comparison grid Default value: 0
cmpmaxlat_ind: int: Index (minus 1) of maximum latitude extent for comparison grid Default value: -2
cmpminlon_ind: int: Index of minimum longitude extent for comparison grid Default value: 0
cmpmaxlon_ind: int: Index (minus 1) of maximum longitude extent for comparison grid Default value: -2
nlev: int: Number of levels of input grid and comparison grid Default value: 1

Returns:

data: xarray DataArray: Original DataArray regridded to comparison grid (including resolution and extent changes)

gcpy.create_regridders(refds, devds, weightsdir='.', reuse_weights=True, cmpres=None, zm=False, sg_ref_params=[1, 170, -90], sg_dev_params=[1, 170, -90])

Internal function used for creating regridders between two datasets. Follows decision logic needed for plotting functions. Originally code from compare_single_level and compare_zonal_mean.

Args:

refds: xarray Dataset: Input dataset
devds: xarray Dataset: Output dataset

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files Default value: True
cmpres: int or str: Specific target resolution for comparison grid used in difference and ratio plots Default value: None (will follow logic chain below)
zm: bool: Set this flag to True if regridders will be used in zonal mean plotting Default value: False
sg_ref_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Ref grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)
sg_dev_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Dev grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)

Returns:

list of many different quantities needed for regridding in plotting functions

refres, devres, cmpres: str or int: Resolution of a dataset grid
refgridtype, devgridtype, cmpgridtype: str: Gridtype of a dataset (‘ll’ or ‘cs’)
regridref, regriddev, regridany: bool: Whether to regrid a dataset
refgrid, devgrid, cmpgrid: dict: Grid definition of a dataset
refregridder, devregridder: xESMF regridder: Regridder object between refgrid or devgrid and cmpgrid (will be None if input grid is not lat/lon)
refregridder_list, devregridder_list: list[6 xESMF regridders]: List of regridder objects for each face between refgrid or devgrid and cmpgrid (will be None if input grid is not cubed-sphere)

gcpy.gen_xmat(p_edge_from, p_edge_to)

Generates regridding matrix from one vertical grid to another. This function was originally written by Sebastian Eastham and included in package gcgridobj: https://github.com/sdeastham/gcgridobj

Args:

p_edge_from: numpy array: Edge pressures of the input grid
p_edge_to: numpy array: Edge pressures of the target grid

Returns:

xmat: sparse scipy coordinate matrix: Regridding matrix from input grid to target grid

gcpy.regrid_vertical(src_data_3D, xmat_regrid, target_levs=[])

Performs vertical regridding using a sparse regridding matrix. This function was originally written by Sebastian Eastham and included in package gcgridobj: https://github.com/sdeastham/gcgridobj

Args:

src_data_3D: xarray DataArray or numpy array: Data to be regridded
xmat_regrid: sparse scipy coordinate matrix: Regridding matrix from input data grid to target grid
target_levs (optional): list: Values for Z coordinate of returned data (if returned data is of type xr.DataArray) Default value: []

Returns:

out_data: xarray DataArray or numpy array: Data regridded to target grid
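
The idea behind gen_xmat and regrid_vertical can be illustrated with a small dense matrix of layer-overlap weights applied to a single column. This is a simplified sketch, not the gcgridobj algorithm: the helper names are hypothetical, the matrix is a nested list rather than a sparse scipy matrix, and weighting each source layer by the fraction of the target layer it covers is an assumption.

```python
def overlap_matrix(p_edge_from, p_edge_to):
    """Return weights[j][i]: fraction of target layer j covered by source layer i."""
    n_from = len(p_edge_from) - 1
    n_to = len(p_edge_to) - 1
    weights = [[0.0] * n_from for _ in range(n_to)]
    for j in range(n_to):
        # Pressure decreases with height, so the lower edge has the larger value
        to_bot, to_top = p_edge_to[j], p_edge_to[j + 1]
        for i in range(n_from):
            from_bot, from_top = p_edge_from[i], p_edge_from[i + 1]
            overlap = min(to_bot, from_bot) - max(to_top, from_top)
            if overlap > 0:
                weights[j][i] = overlap / (to_bot - to_top)
    return weights


def regrid_column(column, weights):
    """Apply the weight matrix to one vertical column of data."""
    return [sum(w * v for w, v in zip(row, column)) for row in weights]


# Hypothetical grids: four source layers merged into two target layers
p_from = [1000.0, 750.0, 500.0, 250.0, 0.0]
p_to = [1000.0, 500.0, 0.0]
weights = overlap_matrix(p_from, p_to)
coarse = regrid_column([1.0, 3.0, 5.0, 7.0], weights)
```

Each row of the matrix sums to 1 when the target layer lies fully inside the source grid, so a vertically uniform field is left unchanged.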

gcpy.reshape_MAPL_CS(da)

Reshapes data if its dimensions indicate MAPL v1.0.0+ output

Args:

da: xarray DataArray: Data array variable

Returns:

data: xarray DataArray: Data with dimensions renamed and transposed to match old MAPL format

gcpy.get_diff_of_diffs(ref, dev)

Generate datasets containing differences between two datasets

Args:

ref: xarray Dataset: The “Reference” (aka “Ref”) dataset.
dev: xarray Dataset: The “Development” (aka “Dev”) dataset

Returns:

absdiffs: xarray Dataset: Dataset containing dev-ref values
fracdiffs: xarray Dataset: Dataset containing dev/ref values
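
The two returned quantities can be illustrated with plain dicts of lists standing in for xarray Datasets. The helper name diff_of_diffs_sketch, the restriction to variables present in both inputs, and the NaN returned for a zero reference value are assumptions of this example.

```python
def diff_of_diffs_sketch(ref, dev):
    """Return (absdiffs, fracdiffs) for variables present in both inputs."""
    absdiffs, fracdiffs = {}, {}
    for name in ref:
        if name not in dev:
            continue  # only variables common to both are compared (assumption)
        pairs = list(zip(ref[name], dev[name]))
        absdiffs[name] = [d - r for r, d in pairs]
        # Guard against division by zero; NaN marks an undefined ratio
        fracdiffs[name] = [d / r if r != 0 else float("nan") for r, d in pairs]
    return absdiffs, fracdiffs


# Hypothetical values
ref = {"O3": [40.0, 50.0], "CO": [100.0, 80.0], "RefOnly": [1.0, 2.0]}
dev = {"O3": [60.0, 25.0], "CO": [100.0, 100.0]}
absdiffs, fracdiffs = diff_of_diffs_sketch(ref, dev)
```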

gcpy.get_nan_mask(data)

Create a mask with NaN values removed from an input array

Args:

data: numpy array: Input array possibly containing NaNs

Returns:

new_data: numpy array: Original array with NaN values removed

gcpy.all_zero_or_nan(ds)

Return whether ds is all zeros, or all nans

Args:

ds: numpy array: Input GEOS-Chem data

Returns:

all_zero, all_nan: bool, bool: all_zero is whether ds is all zeros, all_nan is whether ds is all NaNs
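
A minimal sketch of the two flags, using a flat list in place of a numpy array. The helper name zero_or_nan_flags is hypothetical.

```python
import math


def zero_or_nan_flags(values):
    """Return (all_zero, all_nan) for a non-empty sequence of floats."""
    all_zero = all(v == 0 for v in values)          # NaN never compares equal to 0
    all_nan = all(math.isnan(v) for v in values)
    return all_zero, all_nan


zeros = zero_or_nan_flags([0.0, 0.0, 0.0])
nans = zero_or_nan_flags([float("nan"), float("nan")])
mixed = zero_or_nan_flags([0.0, 1.5, float("nan")])
```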

gcpy.slice_by_lev_and_time(ds, varname, itime, ilev, flip)

Slice a DataArray by desired time and level.

Args:

ds: xarray Dataset: Dataset containing GEOS-Chem data.
varname: str: Variable name for data variable to be sliced
itime: int: Index of time by which to slice
ilev: int: Index of level by which to slice
flip: bool: Whether to flip ilev to be indexed from ground or top of atmosphere

Returns:

ds[varname]: xarray DataArray: DataArray of data variable sliced according to ilev and itime

gcpy.compare_varnames(refdata, devdata, refonly=None, devonly=None, quiet=False)

Finds variables that are common to two xarray Dataset objects.

Args:

refdata: xarray Dataset: The first Dataset to be compared. (This is often referred to as the “Reference” Dataset.)
devdata: xarray Dataset: The second Dataset to be compared. (This is often referred to as the “Development” Dataset.)

Keyword Args (optional):

quiet: bool: Set this flag to True if you wish to suppress printing informational output to stdout. Default value: False

Returns:

vardict: dict of lists of str: Dictionary containing several lists of variable names, with the following keys:

commonvars: List of variables that are common to both refdata and devdata.
commonvarsOther: List of variables that are common to both refdata and devdata, but do not have lat, lon, and/or level dimensions (e.g. index variables).
commonvars2D: List of variables that are common to refdata and devdata, and that have lat and lon dimensions, but not level.
commonvars3D: List of variables that are common to refdata and devdata, and that have lat, lon, and level dimensions.
refonly: List of 2D or 3D variables that are only present in refdata.
devonly: List of 2D or 3D variables that are only present in devdata.
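
The classification into these lists can be sketched with plain dicts that map each variable name to its dimension names. The helper name classify_varnames, the dimension names "lat", "lon" and "lev", and the sample variables are assumptions; the real function works on xarray Datasets.

```python
def has_latlon(dims):
    """True if a variable has both lat and lon dimensions."""
    return "lat" in dims and "lon" in dims


def classify_varnames(ref_dims, dev_dims):
    """ref_dims and dev_dims map variable name -> tuple of dimension names."""
    common = sorted(set(ref_dims) & set(dev_dims))
    return {
        "commonvars": common,
        "commonvarsOther": [v for v in common if not has_latlon(ref_dims[v])],
        "commonvars2D": [v for v in common
                         if has_latlon(ref_dims[v]) and "lev" not in ref_dims[v]],
        "commonvars3D": [v for v in common
                         if has_latlon(ref_dims[v]) and "lev" in ref_dims[v]],
        # Only 2D or 3D variables are reported as ref-only or dev-only
        "refonly": sorted(v for v in set(ref_dims) - set(dev_dims)
                          if has_latlon(ref_dims[v])),
        "devonly": sorted(v for v in set(dev_dims) - set(ref_dims)
                          if has_latlon(dev_dims[v])),
    }


# Hypothetical variables and dimensions
ref_dims = {
    "SpeciesConc_O3": ("time", "lev", "lat", "lon"),
    "Met_PS": ("time", "lat", "lon"),
    "hyam": ("lev",),
    "OldDiag": ("time", "lat", "lon"),
}
dev_dims = {
    "SpeciesConc_O3": ("time", "lev", "lat", "lon"),
    "Met_PS": ("time", "lat", "lon"),
    "hyam": ("lev",),
    "NewDiag": ("time", "lev", "lat", "lon"),
}
vardict = classify_varnames(ref_dims, dev_dims)
```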

gcpy.check_units(ref_da, dev_da, enforce_units=True)

Ensures the units of two xarray DataArrays are the same.

Args:

ref_da: xarray DataArray: First data array containing a units attribute.
dev_da: xarray DataArray: Second data array containing a units attribute.

Keyword Args (optional):

enforce_units: bool: Whether to stop program if ref and dev units do not match Default value: True

Returns:

units_match: bool

gcpy.data_unit_is_mol_per_mol(da)

Check whether the units of an xarray DataArray are mol/mol, based on a fixed list of unit strings that denote mol/mol.

Args:

da: xarray DataArray: Data array containing a units attribute

Returns:

is_molmol: bool: Whether input units are mol/mol
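
The check amounts to a membership test against a fixed collection of unit strings. The strings listed below are illustrative guesses, not the list GCPy actually uses, and the helper name unit_is_mol_per_mol is hypothetical.

```python
# Hypothetical spellings of mol/mol; the real list may differ
MOLMOL_UNITS = {"mol mol-1 dry", "mol mol-1", "mol/mol", "mol/mol dry"}


def unit_is_mol_per_mol(units):
    """Return True if the units string matches a known mol/mol spelling."""
    return units.strip() in MOLMOL_UNITS


is_molmol = unit_is_mol_per_mol("mol mol-1 dry")
is_flux = unit_is_mol_per_mol("kg m-2 s-1")
```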

gcpy.MW_AIR_g = 28.9644

gcpy._warning_format

gcpy._current_dir

gcpy._rgb_WhGrYlRd

gcpy.WhGrYlRd

gcpy.six_plot(subplot, all_zero, all_nan, plot_val, grid, ax, rowcol, title, comap, unit, extent, masked_data, other_all_nan, gridtype, vmins, vmaxs, use_cmap_RdBu, match_cbar, verbose, log_color_scale, pedge=np.full((1, 1), -1), pedge_ind=np.full((1, 1), -1), log_yaxis=False, xtick_positions=[], xticklabels=[], plot_type='single_level', ratio_log=False, proj=ccrs.PlateCarree(), ll_plot_func='imshow', **extra_plot_args)

Plotting function to be called from compare_single_level or compare_zonal_mean. Primarily exists to eliminate code redundancy in the prior listed functions and has not been tested separately.

Args:

subplot: str: Type of plot to create (ref, dev, absolute difference or fractional difference)
all_zero: bool: Set this flag to True if the data to be plotted consist only of zeros
all_nan: bool: Set this flag to True if the data to be plotted consist only of NaNs
plot_val: xarray DataArray: Single variable GEOS-Chem output values to plot
grid: dict: Dictionary mapping plot_val to plottable coordinates
ax: matplotlib axes: Axes object to plot information. Will create a new axes if none is passed.
rowcol: tuple: Subplot position in overall Figure
title: str: Title to print on axes
comap: matplotlib Colormap: Colormap for plotting data values
unit: str: Units of plotted data
extent: tuple (minlon, maxlon, minlat, maxlat): Describes minimum and maximum latitude and longitude of input data
masked_data: numpy array: Masked area for cubed-sphere plotting
other_all_nan: bool: Set this flag to True if plotting ref/dev and the other of ref/dev is all nan
gridtype: str: “ll” for lat/lon or “cs” for cubed-sphere
vmins: list of float: list of length 3 of minimum ref value, dev value, and absdiff value
vmaxs: list of float: list of length 3 of maximum ref value, dev value, and absdiff value
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap
match_cbar: bool: Set this flag to True if you are plotting with the same colorbar for ref and dev
verbose: bool: Set this flag to True to enable informative printout.
log_color_scale: bool: Set this flag to True to enable log-scale colormapping

Keyword Args (optional):

pedge: numpy array: Edge pressures of grid cells in data to be plotted Default value: np.full((1,1), -1)
pedge_ind: numpy array: Indices where edge pressure values are within a given pressure range Default value: np.full((1,1), -1)
log_yaxis: bool: Set this flag to True to enable log scaling of pressure in zonal mean plots Default value: False
xtick_positions: list of float: Locations of lat/lon or lon ticks on plot Default value: []
xticklabels: list of str: Labels for lat/lon ticks Default value: []
plot_type: str: Type of plot, either “single_level” or “zonal_mean” Default value: “single_level”
ratio_log: bool: Set this flag to True to enable log scaling for ratio plots Default value: False
proj: cartopy projection: Projection for plotting data Default value: ccrs.PlateCarree()
ll_plot_func: str: Function to use for lat/lon single level plotting with possible values ‘imshow’ and ‘pcolormesh’. imshow is much faster but is slightly displaced when plotting from dateline to dateline and/or pole to pole. Default value: ‘imshow’
extra_plot_args: various: Any extra keyword arguments are passed through the plotting functions to be used in calls to pcolormesh() (CS) or imshow() (Lat/Lon).

gcpy.compare_single_level(refdata, refstr, devdata, devstr, varlist=None, ilev=0, itime=0, refmet=None, devmet=None, weightsdir='.', pdfname='', cmpres=None, match_cbar=True, normalize_by_area=False, enforce_units=True, convert_to_ugm3=False, flip_ref=False, flip_dev=False, use_cmap_RdBu=False, verbose=False, log_color_scale=False, extra_title_txt=None, extent=[-1000, -1000, -1000, -1000], n_job=-1, sigdiff_list=[], second_ref=None, second_dev=None, spcdb_dir=os.path.dirname(__file__), sg_ref_path='', sg_dev_path='', ll_plot_func='imshow', **extra_plot_args)

Create single-level 3x2 comparison map plots for variables common in two xarray Datasets. Optionally save to PDF.

Args:

refdata: xarray dataset: Dataset used as reference in comparison
refstr: str OR list of str: String description for reference data to be used in plots OR list containing [ref1str, ref2str] for diff-of-diffs plots
devdata: xarray dataset: Dataset used as development in comparison
devstr: str OR list of str: String description for development data to be used in plots OR list containing [dev1str, dev2str] for diff-of-diffs plots

Keyword Args (optional):

varlist: list of strings: List of xarray dataset variable names to make plots for Default value: None (will compare all common variables)
ilev: integer: Dataset level dimension index using 0-based system. Indexing is ambiguous when plotting differing vertical grids Default value: 0
itime: integer: Dataset time dimension index using 0-based system Default value: 0
refmet: xarray dataset: Dataset containing ref meteorology Default value: None
devmet: xarray dataset: Dataset containing dev meteorology Default value: None
weightsdir: str: Directory path for storing regridding weights Default value: ‘.’ (will create/store weights in current directory)
pdfname: str: File path to save plots as PDF Default value: Empty string (will not create PDF)
cmpres: str: String description of grid resolution at which to compare datasets Default value: None (will compare at highest resolution of ref and dev)
match_cbar: bool: Set this flag to True if you wish to use the same colorbar bounds for the Ref and Dev plots. Default value: True
normalize_by_area: bool: Set this flag to True if you wish to normalize the Ref and Dev raw data by grid area. Input ref and dev datasets must include AREA variable in m2 if normalizing by area. Default value: False
enforce_units: bool: Set this flag to True to force an error if Ref and Dev variables have different units. Default value: True
convert_to_ugm3: bool: Whether to convert data units to ug/m3 for plotting. Default value: False
flip_ref: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Ref dataset. Default value: False
flip_dev: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Dev dataset. Default value: False
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap for plotting the raw data in both the Ref and Dev datasets. Default value: False
verbose: bool: Set this flag to True to enable informative printout. Default value: False
log_color_scale: bool: Set this flag to True to plot data (not diffs) on a log color scale. Default value: False
extra_title_txt: str: Specifies extra text (e.g. a date string such as “Jan2016”) for the top-of-plot title. Default value: None
extent: list: Defines the extent of the region to be plotted in form [minlon, maxlon, minlat, maxlat]. Default value plots extent of input grids. Default value: [-1000, -1000, -1000, -1000]
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
sigdiff_list: list of str: Returns a list of all quantities having significant differences (where |max(fractional difference)| > 0.1). Default value: []
second_ref: xarray Dataset: A dataset of the same model type / grid as refdata, to be used in diff-of-diffs plotting. Default value: None
second_dev: xarray Dataset: A dataset of the same model type / grid as devdata, to be used in diff-of-diffs plotting. Default value: None
spcdb_dir: str: Directory containing species_database.yml file. Default value: Path of GCPy code repository
sg_ref_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the ref dataset Default value: ‘’ (will not be read in)
sg_dev_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the dev dataset Default value: ‘’ (will not be read in)
ll_plot_func: str: Function to use for lat/lon single level plotting with possible values ‘imshow’ and ‘pcolormesh’. imshow is much faster but is slightly displaced when plotting from dateline to dateline and/or pole to pole. Default value: ‘imshow’
extra_plot_args: various: Any extra keyword arguments are passed through the plotting functions to be used in calls to pcolormesh() (CS) or imshow() (Lat/Lon).

gcpy.compare_zonal_mean(refdata, refstr, devdata, devstr, varlist=None, itime=0, refmet=None, devmet=None, weightsdir='.', pdfname='', cmpres=None, match_cbar=True, pres_range=[0, 2000], normalize_by_area=False, enforce_units=True, convert_to_ugm3=False, flip_ref=False, flip_dev=False, use_cmap_RdBu=False, verbose=False, log_color_scale=False, log_yaxis=False, extra_title_txt=None, n_job=-1, sigdiff_list=[], second_ref=None, second_dev=None, spcdb_dir=os.path.dirname(__file__), sg_ref_path='', sg_dev_path='', ref_vert_params=[[], []], dev_vert_params=[[], []], **extra_plot_args)

Create 3x2 comparison zonal-mean plots for variables common in two xarray Datasets. Optionally save to PDF.

Args:

refdata: xarray dataset: Dataset used as reference in comparison
refstr: str OR list of str: String description for reference data to be used in plots OR list containing [ref1str, ref2str] for diff-of-diffs plots
devdata: xarray dataset: Dataset used as development in comparison
devstr: str OR list of str: String description for development data to be used in plots OR list containing [dev1str, dev2str] for diff-of-diffs plots

Keyword Args (optional):

varlist: list of strings: List of xarray dataset variable names to make plots for Default value: None (will compare all common 3D variables)
itime: integer: Dataset time dimension index using 0-based system Default value: 0
refmet: xarray dataset: Dataset containing ref meteorology Default value: None
devmet: xarray dataset: Dataset containing dev meteorology Default value: None
weightsdir: str: Directory path for storing regridding weights Default value: ‘.’ (will create/store weights in current directory)
pdfname: str: File path to save plots as PDF Default value: Empty string (will not create PDF)
cmpres: str: String description of grid resolution at which to compare datasets Default value: None (will compare at highest resolution of Ref and Dev)
match_cbar: bool: Set this flag to True to use the same colorbar bounds for both Ref and Dev plots. Default value: True
pres_range: list of two integers: Pressure range of levels to plot [hPa]. The vertical axis will span the outer pressure edges of levels that contain pres_range endpoints. Default value: [0,2000]
normalize_by_area: bool: Set this flag to True to normalize raw data in both Ref and Dev datasets by grid area. Input ref and dev datasets must include AREA variable in m2 if normalizing by area. Default value: False
enforce_units: bool: Set this flag to True to force an error if the variables in the Ref and Dev datasets have different units. Default value: True
convert_to_ugm3: bool: Whether to convert data units to ug/m3 for plotting. Default value: False
flip_ref: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Ref dataset. Default value: False
flip_dev: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Dev dataset. Default value: False
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap for plotting raw reference and development datasets. Default value: False
verbose: bool: Set this flag to True to enable informative printout. Default value: False
log_color_scale: bool: Set this flag to True to enable plotting data (not diffs) on a log color scale. Default value: False
log_yaxis: bool: Set this flag to True if you wish to create zonal mean plots with a log-pressure Y-axis. Default value: False
extra_title_txt: str: Specifies extra text (e.g. a date string such as “Jan2016”) for the top-of-plot title. Default value: None
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
sigdiff_list: list of str: Returns a list of all quantities having significant differences (where |max(fractional difference)| > 0.1). Default value: []
second_ref: xarray Dataset: A dataset of the same model type / grid as refdata, to be used in diff-of-diffs plotting. Default value: None
second_dev: xarray Dataset: A dataset of the same model type / grid as devdata, to be used in diff-of-diffs plotting. Default value: None
spcdb_dir: str: Directory containing species_database.yml file. Default value: Path of GCPy code repository
sg_ref_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the ref dataset Default value: ‘’ (will not be read in)
sg_dev_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the dev dataset Default value: ‘’ (will not be read in)
ref_vert_params: list(AP, BP) of list-like types: Hybrid grid parameter A in hPa and B (unitless). Needed if ref grid is not 47 or 72 levels. Default value: [[], []]
dev_vert_params: list(AP, BP) of list-like types: Hybrid grid parameter A in hPa and B (unitless). Needed if dev grid is not 47 or 72 levels. Default value: [[], []]
extra_plot_args: various: Any extra keyword arguments are passed through the plotting functions to be used in calls to pcolormesh() (CS) or imshow() (Lat/Lon).

gcpy.normalize_colors(vmin, vmax, is_difference=False, log_color_scale=False, ratio_log=False)

Normalizes a data range to the colormap range used by matplotlib functions. For log-color scales, special handling is done to prevent taking the log of data that is all zeroes.

Args:

vmin: float: Minimum value of the data range.
vmax: float: Maximum value of the data range.

Keyword Args (optional):

is_difference: bool: Set this switch to denote that we are using a difference color scale (i.e. with zero in the middle of the range). Default value: False
log_color_scale: bool: Logical flag to denote that we are using a logarithmic color scale instead of a linear color scale. Default value: False

Returns:

norm: matplotlib Norm: The normalized matplotlib color range, stored in a matplotlib Norm object.

Remarks:

For log color scales, we will use a range of 3 orders of magnitude (i.e. from vmax/1e3 to vmax).
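
The range selection described above can be sketched without matplotlib by returning only the lower and upper bounds that a Norm object would span. The helper name color_bounds is hypothetical; making the difference scale symmetric about zero and falling back to the linear range for all-zero data are assumptions of this sketch.

```python
def color_bounds(vmin, vmax, is_difference=False, log_color_scale=False):
    """Return the (lower, upper) bounds a color normalization would span."""
    if is_difference:
        # Symmetric range so that zero sits in the middle of the colormap
        limit = max(abs(vmin), abs(vmax))
        return -limit, limit
    if log_color_scale:
        if vmax <= 0:
            # No valid log range for all-zero data; keep the linear range
            return vmin, vmax
        # Three orders of magnitude below the maximum
        return vmax / 1e3, vmax
    return vmin, vmax


diff_bounds = color_bounds(-2.0, 5.0, is_difference=True)
log_bounds = color_bounds(0.0, 1000.0, log_color_scale=True)
zero_bounds = color_bounds(0.0, 0.0, log_color_scale=True)
```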

gcpy.single_panel(plot_vals, ax=None, plot_type='single_level', grid={}, gridtype='', title='fill', comap=WhGrYlRd, norm=[], unit='', extent=(None, None, None, None), masked_data=None, use_cmap_RdBu=False, log_color_scale=False, add_cb=True, pres_range=[0, 2000], pedge=np.full((1, 1), -1), pedge_ind=np.full((1, 1), -1), log_yaxis=False, xtick_positions=[], xticklabels=[], proj=ccrs.PlateCarree(), sg_path='', ll_plot_func='imshow', vert_params=[[], []], pdfname='', weightsdir='.', vmin=None, vmax=None, return_list_of_plots=False, **extra_plot_args)

Core plotting routine – creates a single plot panel.

Args:

plot_vals: xarray DataArray or numpy array: Single data variable GEOS-Chem output to plot

Keyword Args (Optional):

ax: matplotlib axes: Axes object to plot information Default value: None (Will create a new axes)
plot_type: str: Either “single_level” or “zonal_mean” Default value: “single_level”
grid: dict: Dictionary mapping plot_vals to plottable coordinates Default value: {} (will attempt to read grid from plot_vals)
gridtype: str: “ll” for lat/lon or “cs” for cubed-sphere Default value: “” (will automatically determine from grid)
title: str: Title to put at top of plot Default value: “fill” (will use name attribute of plot_vals if available)
comap: matplotlib Colormap: Colormap for plotting data values Default value: WhGrYlRd
norm: list: List with range [0..1] normalizing color range for matplotlib methods Default value: [] (will determine from plot_vals)
unit: str: Units of plotted data Default value: “” (will use units attribute of plot_vals if available)
extent: tuple (minlon, maxlon, minlat, maxlat): Describes minimum and maximum latitude and longitude of input data Default value: (None, None, None, None) (will use full extent of plot_vals if plot is single level)
masked_data: numpy array: Masked area for avoiding near-dateline cubed-sphere plotting issues Default value: None (will attempt to determine from plot_vals)
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap Default value: False
log_color_scale: bool: Set this flag to True to use a log-scale colormap Default value: False
add_cb: bool: Set this flag to True to add a colorbar to the plot Default value: True
pres_range: list(int): Range from minimum to maximum pressure for zonal mean plotting Default value: [0, 2000] (will plot entire atmosphere)
pedge: numpy array: Edge pressures of vertical grid cells in plot_vals for zonal mean plotting Default value: np.full((1, 1), -1) (will determine automatically)
pedge_ind: numpy array: Index of edge pressure values within pressure range in plot_vals for zonal mean plotting Default value: np.full((1, 1), -1) (will determine automatically)
log_yaxis: bool: Set this flag to True to enable log scaling of pressure in zonal mean plots Default value: False
xtick_positions: list(float): Locations of lat/lon or lon ticks on plot Default value: [] (will place automatically for zonal mean plots)
xticklabels: list(str): Labels for lat/lon ticks Default value: [] (will determine automatically from xtick_positions)
proj: cartopy projection: Projection for plotting data Default value: ccrs.PlateCarree()
sg_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for plot_vals Default value: ‘’ (will not be read in)
ll_plot_func: str: Function to use for lat/lon single level plotting with possible values ‘imshow’ and ‘pcolormesh’. imshow is much faster but is slightly displaced when plotting from dateline to dateline and/or pole to pole. Default value: ‘imshow’
vert_params: list(AP, BP) of list-like types: Hybrid grid parameter A in hPa and B (unitless). Needed if grid is not 47 or 72 levels. Default value: [[], []]
pdfname: str: File path to save plots as PDF Default value: “” (will not create PDF)
weightsdir: str: Directory path for storing regridding weights Default value: “.” (will store regridding files in current directory)
vmin: float: minimum for colorbars Default value: None (will use plot value minimum)
vmax: float: maximum for colorbars Default value: None (will use plot value maximum)
return_list_of_plots: bool: Return plots as a list. This is helpful if you are using a cubed-sphere grid and would like access to all 6 plots Default value: False
extra_plot_args: various: Any extra keyword arguments are passed to calls to pcolormesh() (CS) or imshow() (Lat/Lon).

Returns:

plot: matplotlib plot: Plot object created from input

gcpy.combine_dataset(file_list=None)

Wrapper for xarray.open_mfdataset, taking into account the extra arguments needed in xarray 0.15 and later.

Args:

file_list: list of str

Returns:

ds: xarray Dataset

gcpy.validate_metrics_collection(ds)

Determines if a Dataset contains variables for computing metrics from a CH4 simulation or a fullchem simulation.

Args:

ds: xarray Dataset

Returns:

is_ch4_sim: bool

gcpy.read_metrics_collection(files)

Reads data from all “Metrics” collection netCDF files into a single xarray Dataset.

Args:

files: str or list of str: Path(s) of the “Metrics” collection netCDF file(s) to read.

Returns:

ds: xarray Dataset

gcpy.total_airmass(ds)

Computes the total airmass (in both kg and molec).

Args:

ds: xarray Dataset

Returns:

airmass_kg, airmass_m: numpy float64: Total atmospheric air mass in [kg] and [molec]

gcpy.global_mean_oh(ds, airmass_kg, mw_oh_kg)

Computes the global mean OH concentration (1e5 molec cm-3)

Args:

sum_airmass_kg: numpy float64
ds: xarray Dataset

Returns:

sum_mean_oh: numpy float64

gcpy.lifetimes_wrt_oh(ds, airmass_m)

Computes the lifetimes (in years) of CH4 and CH3CCl3 (aka MCF) against tropospheric OH.

Args:

ds: xarray Dataset

airmass_m: numpy float64: Total airmass [molecules]
s_per_yr: numpy float64: Conversion factor: seconds to year.

Returns:

ch4_life_wrt_oh, mcf_life_wrt_oh: numpy float64

gcpy.init_common_vars(ref, refstr, dev, devstr, spcdb_dir)

Returns a dictionary containing various quantities that need to be passed between methods.

Args:

ref: str: Path name of “Ref” (aka “Reference”) data set file.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name of “Dev” (aka “Development”) data set file. The “Dev” data set will be compared against the “Ref” data set.
devstr: str: A string to describe dev (e.g. version number)
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository

Returns:

common_vars: dict

gcpy.compute_oh_metrics(common_vars)

Computes the mass-weighted mean OH concentration, CH3CCl3 (aka MCF) lifetime w/r/t OH, and CH4 lifetime w/r/t OH.

Args:

common_vars: dict

Returns:

common_vars: dict
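
A mass-weighted mean weights each grid-box concentration by the air mass of that box before averaging. The sketch below shows only that arithmetic on flat lists; the helper name and the sample numbers are hypothetical, and the unit conversions performed by the real routine are omitted.

```python
def mass_weighted_mean(concentrations, airmass):
    """Return sum(c_i * m_i) / sum(m_i) over all grid boxes."""
    weighted_sum = sum(c * m for c, m in zip(concentrations, airmass))
    return weighted_sum / sum(airmass)


# Hypothetical values: two grid boxes with unequal air mass
mean_conc = mass_weighted_mean([10.0, 20.0], [3.0, 1.0])
print(mean_conc)
```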

gcpy.write_to_file(f, title, ref, dev, absdiff, pctdiff, is_mean_oh=False)

Internal routine used by print_metrics to write a specific quantity (mean OH, MCF lifetime, CH4 lifetime) to a file.

Args:

f: file

title: str

ref, dev, absdiff, pctdiff: numpy float64

is_mean_oh: bool

gcpy.print_metrics(common_vars, dst)

Prints the mass-weighted mean OH (full atmospheric column) from a GEOS-Chem simulation.

Args:

ds: xarray Dataset
is_ch4_sim: bool

gcpy.make_benchmark_oh_metrics(ref, refstr, dev, devstr, dst='./benchmark', overwrite=True, spcdb_dir=os.path.dirname(__file__))

Creates a text file containing metrics of global mean OH, MCF lifetime, and CH4 lifetime for benchmarking purposes.

Args:

ref: str: Path name of “Ref” (aka “Reference”) data set file.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name of “Dev” (aka “Development”) data set file. The “Dev” data set will be compared against the “Ref” data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

dst: str: A string denoting the destination folder where the file containing OH metrics will be written. Default value: ./benchmark
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: True
spcdb_dir: str: Directory of species_database.yml file. Default value: Directory of GCPy code repository

class gcpy._GlobVars(reflogdir, refstr, devlogdir, devstr, year, dst, overwrite): Private class _GlobVars contains global data that needs to be shared among the methods in this module.

gcpy.find_mean_oh(filename)

Searches a GEOS-Chem “Classic” log file for the Mean OH value.

Args:

filename: str: GEOS-Chem “Classic” log file.

gcpy.compute_mean_oh_from_logs(globvars)

Computes mean OH from GEOS-Chem FullChemBenchmark log files.

Args:

globvars: _GlobVars: Global variables

gcpy.print_mean_oh_from_logs(globvars, df)

Prints the mean OH table from 1-year FullChemBenchmark log files.

Args:

globvars: _GlobVars: Global variables
df: pandas DataFrame: Strat-trop exchange table

gcpy.make_benchmark_oh_from_logs(reflogdir, refstr, devlogdir, devstr, year, dst='./1yr_benchmark', overwrite=True)

Creates the table of mean OH concentrations, as obtained from log files.

Args:

reflogdir, devlogdir: str: Directory containing log files from Ref and Dev simulations.
refstr, devstr: str: String label for the Ref and Dev simulations.
year: str: Year of the Ref and Dev benchmark simulations.

Keyword Args (optional):

dst: str: Folder in which the mean OH table will be printed.
overwrite: bool: If true, will overwrite the existing OH table in dst.

gcpy.maindir = /path/to/benchmark/main/dir

gcpy.get_shape_of_data(data, vertical_dim='lev', return_dims=False)

Convenience routine to return the shape (and dimensions, if requested) of an xarray Dataset or xarray DataArray. Can also take as input a dictionary of sizes (i.e. {‘time’: 1, ‘lev’: 72, …}) from an xarray Dataset or xarray DataArray object.

Args:

data: xarray Dataset, xarray DataArray, or dict: The data for which the size is requested.

Keyword Args (optional):

vertical_dim: str: Specify the vertical dimension that you wish to return: lev or ilev. Default value: ‘lev’
return_dims: bool: Set this switch to True if you also wish to return a list of dimensions in the same order as the tuple of dimension sizes. Default value: False

Returns:

shape: tuple of int: Tuple containing the sizes of each dimension of data in order: (time, lev|ilev, nf, lat|Ydim, lon|Xdim).
dims: list of str: If return_dims is True, then dims will contain a list of dimension names in the same order as shape ([‘time’, ‘lev’, ‘lat’, ‘lon’] for GEOS-Chem “Classic”, or [‘time’, ‘lev’, ‘nf’, ‘Ydim’, ‘Xdim’] for GCHP).
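The ordering rule described above can be sketched without xarray by working directly on a dictionary of sizes. This is a minimal illustration, not the GCPy implementation: shape_from_sizes is a hypothetical helper, and the sample sizes are made up.

```python
def shape_from_sizes(sizes, vertical_dim="lev", return_dims=False):
    """Hypothetical helper: order sizes as (time, lev|ilev, nf, lat|Ydim, lon|Xdim)."""
    # Candidate dimension names in the documented order; names that are
    # absent from `sizes` are skipped.
    order = ["time", vertical_dim, "nf", "lat", "Ydim", "lon", "Xdim"]
    dims = [name for name in order if name in sizes]
    shape = tuple(sizes[name] for name in dims)
    return (shape, dims) if return_dims else shape


# Illustrative sizes for a GEOS-Chem "Classic" and a GCHP dataset.
gcc_sizes = {"lon": 72, "lat": 46, "lev": 72, "time": 1}
gchp_sizes = {"time": 1, "lev": 72, "nf": 6, "Ydim": 24, "Xdim": 24}

gcc_shape = shape_from_sizes(gcc_sizes)
gchp_shape, gchp_dims = shape_from_sizes(gchp_sizes, return_dims=True)
```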

gcpy.scs_transform(x, y, s, tx, ty)

gcpy.R_EARTH_m = 6371007.2

gcpy.get_troposphere_mask(ds)

Returns a mask array for picking out the tropospheric grid boxes.

Args:

ds: xarray Dataset: Dataset containing certain met field variables (i.e. Met_TropLev, Met_BXHEIGHT).

Returns:

tropmask: numpy ndarray: Tropospheric mask. False denotes grid boxes that are in the troposphere and True in the stratosphere (as per Python masking logic).

gcpy.get_input_res(data)

Returns resolution of dataset passed to compare_single_level or compare_zonal_means

Args:

data: xarray Dataset: Input GEOS-Chem dataset

Returns:

res: str or int: Lat/lon res of the form ‘latresxlonres’ or cubed-sphere resolution
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere

gcpy.call_make_grid(res, gridtype, in_extent=[-180, 180, -90, 90], out_extent=[-180, 180, -90, 90], sg_params=[1, 170, -90])

Creates a grid description for the requested resolution and grid type (lat/lon, cubed-sphere, or stretched-grid).

Args:

res: str or int: Resolution of grid (format ‘latxlon’ or csres)
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere

Keyword Args (optional):

in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of input data in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Desired minimum and maximum latitude and longitude of output grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
sg_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Desired stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Will trigger stretched-grid creation if not default values. Default value: [1, 170, -90] (no stretching)

Returns:

[grid, grid_list]: list(dict, list(dict)): Returns the created grid. grid_list is a list of grids if gridtype is ‘cs’, else it is None
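The selection between grid types can be pictured as a small dispatch on gridtype and sg_params. The sketch below only returns the name of the generator that a request would plausibly map to; choose_grid_generator is a hypothetical helper, and the mapping onto make_grid_LL, make_grid_CS, and make_grid_SG is an assumption inferred from the argument descriptions above.

```python
DEFAULT_SG_PARAMS = [1, 170, -90]  # documented default: no stretching


def choose_grid_generator(gridtype, sg_params=None):
    """Hypothetical helper: name the grid generator a request maps to."""
    if sg_params is None:
        sg_params = DEFAULT_SG_PARAMS
    if gridtype == "ll":
        return "make_grid_LL"
    if gridtype == "cs":
        # Any non-default stretched-grid parameters select a stretched grid.
        if list(sg_params) != DEFAULT_SG_PARAMS:
            return "make_grid_SG"
        return "make_grid_CS"
    raise ValueError(f"unknown gridtype: {gridtype!r}")


ll_choice = choose_grid_generator("ll")
cs_choice = choose_grid_generator("cs")
sg_choice = choose_grid_generator("cs", sg_params=[2.0, 10.0, 35.0])
```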

gcpy.get_grid_extents(data, edges=True)

Get min and max lat and lon from an input GEOS-Chem xarray dataset or grid dict

Args:

data: xarray Dataset or dict: A GEOS-Chem dataset or a grid dict
edges (optional): bool: Whether grid extents should use cell edges instead of centers Default value: True

Returns:

minlon: float: Minimum longitude of data grid
maxlon: float: Maximum longitude of data grid
minlat: float: Minimum latitude of data grid
maxlat: float: Maximum latitude of data grid
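For a grid dict with 1-D coordinate lists, the extents reduce to taking minima and maxima of either the edge or the center arrays. The following is a minimal sketch under that assumption; grid_extents is a hypothetical helper and the toy grid is not a real GEOS-Chem resolution.

```python
def grid_extents(grid, edges=True):
    """Hypothetical helper: return (minlon, maxlon, minlat, maxlat)."""
    # Use cell edges ('lon_b', 'lat_b') or cell centers ('lon', 'lat').
    lon_key, lat_key = ("lon_b", "lat_b") if edges else ("lon", "lat")
    lons, lats = grid[lon_key], grid[lat_key]
    return min(lons), max(lons), min(lats), max(lats)


# Illustrative coarse grid with three longitude cells and two latitude bands.
toy_grid = {
    "lon_b": [-180.0, -60.0, 60.0, 180.0],
    "lat_b": [-90.0, 0.0, 90.0],
    "lon": [-120.0, 0.0, 120.0],
    "lat": [-45.0, 45.0],
}

edge_extent = grid_extents(toy_grid)
center_extent = grid_extents(toy_grid, edges=False)
```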

gcpy.get_vert_grid(dataset, AP=[], BP=[])

Determine vertical grid of input dataset

Args:

dataset: xarray Dataset: A GEOS-Chem output dataset

Keyword Args (optional):

AP: list-like type: Hybrid grid parameter A in hPa Default value: []
BP: list-like type: Hybrid grid parameter B (unitless) Default value: []

Returns:

p_edge: numpy array: Edge pressure values for vertical grid
p_mid: numpy array: Midpoint pressure values for vertical grid
nlev: int: Number of levels in vertical grid

gcpy.get_pressure_indices(pedge, pres_range)

Get indices where edge pressure values are within a given pressure range

Args:

pedge: numpy array: Edge pressure values for vertical grid
pres_range: list(float, float): Contains minimum and maximum pressure

Returns:

numpy array: Indices where edge pressure values are within a given pressure range
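The selection can be sketched with plain lists. This is an illustration only: pressure_indices is a hypothetical helper, the edge pressures are made up, and treating both bounds as inclusive is an assumption.

```python
def pressure_indices(pedge, pres_range):
    """Hypothetical helper: indices of edges with pmin <= p <= pmax."""
    pmin, pmax = min(pres_range), max(pres_range)
    return [i for i, p in enumerate(pedge) if pmin <= p <= pmax]


# Illustrative edge pressures in hPa, ordered from surface to top.
toy_pedge = [1000.0, 850.0, 500.0, 200.0, 50.0, 0.01]
idx = pressure_indices(toy_pedge, [100, 900])
```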

gcpy.pad_pressure_edges(pedge_ind, max_ind, pmid_len)

Add outer indices to edge pressure index list

Args:

pedge_ind: list: List of edge pressure indices
max_ind: int: Maximum index
pmid_len: int: Length of pmid which should not be exceeded by indices

Returns:

pedge_ind: list: List of edge pressure indices, possibly with new minimum and maximum indices

gcpy.get_ind_of_pres(dataset, pres)

Get index of pressure level that contains the requested pressure value.

Args:

dataset: xarray Dataset: GEOS-Chem dataset
pres: int or float: Desired pressure value

Returns:

index: int: Index of level in dataset that corresponds to requested pressure
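One way to picture "the level that contains the requested pressure" is to search the edge pressures for the pair that brackets it. The sketch below works on a plain list of edge pressures rather than a dataset; level_containing is a hypothetical helper, and the bracketing convention (lower edge inclusive, upper edge exclusive) is an assumption.

```python
def level_containing(pedge, pres):
    """Hypothetical helper: index i with pedge[i] >= pres > pedge[i + 1]."""
    for i in range(len(pedge) - 1):
        if pedge[i] >= pres > pedge[i + 1]:
            return i
    raise ValueError(f"pressure {pres} hPa is outside the grid")


# Illustrative edge pressures in hPa, ordered from surface to top.
toy_edges = [1000.0, 850.0, 500.0, 200.0]
lev_600 = level_containing(toy_edges, 600.0)
lev_sfc = level_containing(toy_edges, 1000.0)
```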

gcpy.convert_lev_to_pres(dataset, pmid, pedge, lev_type='pmid')

Convert lev dimension to pressure in a GEOS-Chem dataset

Args:

dataset: xarray Dataset: GEOS-Chem dataset
pmid: np.array: Midpoint pressure values
pedge: np.array: Edge pressure values
lev_type (optional): str: Denote whether lev is ‘pedge’ or ‘pmid’ if grid is not 72/73 or 47/48 levels Default value: ‘pmid’

Returns:

dataset: xarray Dataset: Input dataset with “lev” dimension values replaced with pressure values

class gcpy.vert_grid(AP=None, BP=None, p_sfc=1013.25)

p_edge()

p_mid()
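For a hybrid sigma-pressure grid, edge pressures are conventionally computed as A + B * p_sfc, and midpoint pressures as the average of adjacent edges. The sketch below assumes vert_grid follows that convention; HybridGrid is a hypothetical stand-in, and the AP and BP values are made up rather than taken from any GEOS-Chem grid.

```python
class HybridGrid:
    """Hypothetical stand-in for a hybrid sigma-pressure vertical grid."""

    def __init__(self, ap, bp, p_sfc=1013.25):
        if len(ap) != len(bp):
            raise ValueError("AP and BP must have the same length")
        self.ap = list(ap)
        self.bp = list(bp)
        self.p_sfc = p_sfc

    def p_edge(self):
        # Standard hybrid-coordinate relation: p = A + B * p_surface.
        return [a + b * self.p_sfc for a, b in zip(self.ap, self.bp)]

    def p_mid(self):
        # Midpoint pressure taken as the mean of the two bounding edges.
        edges = self.p_edge()
        return [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]


# Illustrative coefficients only (AP in hPa, BP unitless).
toy = HybridGrid(ap=[0.0, 50.0, 100.0, 0.01], bp=[1.0, 0.5, 0.0, 0.0])
toy_edges = toy.p_edge()
toy_mids = toy.p_mid()
```

A grid with N+1 edges yields N midpoints, which matches the 72/73 and 47/48 level counts mentioned for convert_lev_to_pres.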

gcpy._GEOS_72L_AP

gcpy._GEOS_72L_BP

gcpy.GEOS_72L_grid

gcpy._GEOS_47L_AP

gcpy._GEOS_47L_BP

gcpy._xmat_i

gcpy._xmat_j

gcpy._xmat_s

gcpy._x_lev

gcpy._skip_size_vec = [2, 4]

gcpy._number_lumped = [4, 7]

gcpy._i_lev = 36

gcpy._i_lev_72 = 36

gcpy._skip_size

gcpy._xmat_72to47

gcpy.GEOS_47L_grid

gcpy._CAM_26L_AP

gcpy._CAM_26L_BP

gcpy.CAM_26L_grid

gcpy.make_grid_LL(llres, in_extent=[-180, 180, -90, 90], out_extent=[])

Creates a lat/lon grid description.

Args:

llres: str: lat/lon resolution in ‘latxlon’ format (e.g. ‘4x5’)

Keyword Args (optional):

in_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of initial grid in the format [minlon, maxlon, minlat, maxlat] Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Describes minimum and maximum latitude and longitude of target grid in the format [minlon, maxlon, minlat, maxlat]. Needed when intending to use grid to trim extent of input data Default value: [] (assumes value of in_extent)

Returns:

llgrid: dict: Grid description of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}

gcpy.make_grid_CS(csres)

Creates a cubed-sphere grid description.

Args:

csres: int: cubed-sphere resolution of target grid

Returns:

[csgrid, csgrid_list]: list[dict, list[dict]]: csgrid is a dict of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}, where each value has an extra face dimension of length 6. csgrid_list is a list of dicts separated by face index.

gcpy.make_grid_SG(csres, stretch_factor, target_lon, target_lat)

Creates a stretched-grid grid description.

Args:

csres: int: cubed-sphere resolution of target grid
stretch_factor: float: stretch factor of target grid
target_lon: float: target stretching longitude of target grid
target_lat: float: target stretching latitude of target grid

Returns:

[csgrid, csgrid_list]: list[dict, list[dict]]: csgrid is a dict of format {‘lat’ : lat midpoints, ‘lon’ : lon midpoints, ‘lat_b’ : lat edges, ‘lon_b’ : lon edges}, where each value has an extra face dimension of length 6. csgrid_list is a list of dicts separated by face index.

gcpy.calc_rectilinear_lon_edge(lon_stride, center_at_180)

Compute longitude edge vector for a rectilinear grid.

Args:

lon_stride: float: Stride length in degrees. For example, for a standard GEOS-Chem Classic 4x5 grid, lon_stride would be 5.
center_at_180: bool: Whether or not the grid should have a cell center at 180 degrees (i.e. on the date line). If True, the first grid cell is centered on the date line; if False, the first grid edge is on the date line.

Returns:

Longitudes of cell edges in degrees East.

Notes:

All values are forced to be between [-180,180]. For a grid with N cells in each band, N+1 edges will be returned, with the first and last value being duplicates.

Examples:

>>> from gcpy.grid.horiz import calc_rectilinear_lon_edge
>>> calc_rectilinear_lon_edge(5.0, True)
np.array([177.5,-177.5,-172.5,…,177.5])
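The edge construction described for calc_rectilinear_lon_edge can be sketched in plain Python. This is an illustration of the idea under stated assumptions, not the GCPy source: lon_edges is a hypothetical helper, and it assumes lon_stride divides 360 evenly.

```python
def lon_edges(lon_stride, center_at_180):
    """Hypothetical helper: longitude edges wrapped into [-180, 180]."""
    n_cells = round(360.0 / lon_stride)
    # With a cell centered on the date line, the first edge sits half a
    # stride west of -180 and must be wrapped back into range.
    start = -180.0 - lon_stride / 2.0 if center_at_180 else -180.0
    edges = []
    for i in range(n_cells + 1):
        lon = start + i * lon_stride
        if lon < -180.0:
            lon += 360.0
        elif lon > 180.0:
            lon -= 360.0
        edges.append(lon)
    return edges


centered = lon_edges(5.0, True)
uncentered = lon_edges(5.0, False)
```

For a 5-degree stride this gives 73 edges for 72 cells. In the centered case the first and last values coincide at 177.5, which is the duplication mentioned in the notes.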

gcpy.calc_rectilinear_lat_edge(lat_stride, half_polar_grid)

Compute latitude edge vector for a rectilinear grid.

Args:

lat_stride: float: Stride length in degrees. For example, for a standard GEOS-Chem Classic 4x5 grid, lat_stride would be 4.
half_polar_grid: bool: Whether or not the grid should be “half-polar” (i.e. bands at poles are half the size). In either case the grid will start and end at -/+ 90, but when half_polar_grid is True, the first and last bands will have a width of 1/2 the normal lat_stride.

Returns:

Latitudes of cell edges in degrees North.

Notes:

All values are forced to be between [-90,90]. For a grid with N latitude bands, N+1 edges will be returned.

Examples:

>>> from gcpy.grid.horiz import calc_rectilinear_lat_edge
>>> calc_rectilinear_lat_edge(4.0, True)
np.array([-90,-88,-84,-80,…,84,88,90])
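The half-polar layout can be sketched as follows: the two polar bands are half a stride wide and every other band is a full stride. lat_edges is a hypothetical helper that assumes lat_stride divides 180 evenly; it illustrates the layout and is not the GCPy source.

```python
def lat_edges(lat_stride, half_polar_grid):
    """Hypothetical helper: latitude edges from -90 to +90 degrees."""
    n = round(180.0 / lat_stride)
    if half_polar_grid:
        # Interior edges are offset by half a stride from the south pole,
        # leaving half-width bands next to each pole.
        inner = [-90.0 + lat_stride / 2.0 + i * lat_stride for i in range(n)]
        return [-90.0] + inner + [90.0]
    return [-90.0 + i * lat_stride for i in range(n + 1)]


half_polar = lat_edges(4.0, True)
regular = lat_edges(4.0, False)
```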

gcpy.calc_rectilinear_grid_area(lon_edge, lat_edge)

Compute grid cell areas (in m2) for a rectilinear grid.

gcpy.calc_delta_lon(lon_edge)

Compute grid cell longitude widths from an edge vector.

Args:

lon_edge: float: Vector of longitude edges, in degrees East.

Returns:

Width of each cell, degrees East

Notes:

Accounts for looping over the domain.
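"Looping over the domain" means that an edge pair straddling the date line (for example 177.5 followed by -177.5) should still give a positive width. A minimal sketch of that correction, using a hypothetical delta_lon helper rather than the GCPy source:

```python
def delta_lon(lon_edge):
    """Hypothetical helper: cell widths in degrees from longitude edges."""
    widths = []
    for west, east in zip(lon_edge[:-1], lon_edge[1:]):
        width = east - west
        # A negative difference means the cell crosses the date line.
        if width < 0:
            width += 360.0
        widths.append(width)
    return widths


widths = delta_lon([177.5, -177.5, -172.5, -167.5])
```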

gcpy.csgrid_GMAO(res, offset=-10)

Return cubed-sphere coordinates with GMAO face orientation.

Args:

res: Cubed-sphere resolution

This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

gcpy._INV_SQRT_3

gcpy._ASIN_INV_SQRT_3

class gcpy.CSGrid(c, offset=None)

Bases: object

Generator for cubed-sphere grid geometries. CSGrid computes the latitudes and longitudes of cell centers and edges on a cubed-sphere grid, providing a way to retrieve these geometries on-the-fly if your model output data does not include them.

Attributes:

{lon,lat}_center: np.ndarray: lat/lon coordinates for each cell center along the cubed-sphere mesh.
{lon,lat}_edge: np.ndarray: lat/lon coordinates for the midpoint of the edges separating each element on the cubed-sphere mesh.
xyz_{center,edge}: np.ndarray: As above, except coordinates are projected into a 3D cartesian space with common origin to the original lat/lon coordinate system, assuming a unit sphere.

This class was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

_initialize()

gcpy.latlon_to_cartesian(lon, lat): Convert latitude/longitude coordinates along the unit sphere to cartesian coordinates defined by a vector pointing from the sphere’s center to its surface. This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

gcpy.vec_latlon_to_cartesian

gcpy.cartesian_to_latlon(x, y, z, ret_xyz=False): Convert a cartesian coordinate to latitude/longitude coordinates. Optionally return the original cartesian coordinate as a tuple. This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere
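The two conversions are inverses of each other on the unit sphere. The sketch below uses the standard geographic formulas and assumes angles in degrees, which may differ from the units the GCPy functions expect; latlon_to_xyz and xyz_to_latlon are hypothetical names.

```python
import math


def latlon_to_xyz(lon_deg, lat_deg):
    """Hypothetical helper: unit-sphere cartesian vector for (lon, lat)."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def xyz_to_latlon(x, y, z):
    """Hypothetical helper: (lon, lat) in degrees for a cartesian vector."""
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    return lon, lat


xyz = latlon_to_xyz(-71.0, 42.0)
lon_back, lat_back = xyz_to_latlon(*xyz)
```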

gcpy.vec_cartesian_to_latlon

gcpy.spherical_to_cartesian(theta, phi, r=1): Convert spherical coordinates in the form (theta, phi[, r]) to cartesian, with the origin at the center of the original spherical coordinate system. This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

gcpy.vec_spherical_to_cartesian

gcpy.cartesian_to_spherical(x, y, z): Convert cartesian coordinates to spherical in the form (theta, phi[, r]) with the origin remaining at the center of the original spherical coordinate system. This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

gcpy.vec_cartesian_to_spherical

gcpy.rotate_sphere_3D(theta, phi, r, rot_ang, rot_axis='x'): Rotate a spherical coordinate in the form (theta, phi[, r]) about the indicated axis, ‘rot_axis’. This method accomplishes the rotation by projecting to a cartesian coordinate system and performing a solid body rotation around the requested axis. This function was originally written by Jiawei Zhuang and included in package cubedsphere: https://github.com/JiaweiZhuang/cubedsphere

gcpy.AVOGADRO = 6.022140857e+23

gcpy.BOLTZ = 1.38064852e-23

gcpy.G = 9.80665

gcpy.R_EARTH_m = 6371007.2

gcpy.R_EARTH_km = 6371.0072

gcpy.MW_AIR_g = 28.9644

gcpy.MW_AIR_kg = 0.0289644

gcpy.MW_H2O_kg = 0.018016

gcpy.RD = 287.0

gcpy.RSTARG = 8.3144598

gcpy.RV = 461.0

gcpy.skip_these_vars = ['anchor', 'ncontact', 'orientation', 'contacts', 'cubed_sphere']

gcpy.rename_and_flip_gchp_rst_vars(ds)

Transforms a GCHP restart dataset to match GCC names and level convention

Args:

ds: xarray Dataset: Dataset containing GCHP restart file data, such as variables SPC_{species}, BXHEIGHT, DELP_DRY, and TropLev, with level convention down (level 0 is top-of-atmosphere).

Returns:

ds: xarray Dataset: Dataset containing GCHP restart file data with names and level convention matching GCC restart. Variables include SpeciesRst_{species}, Met_BXHEIGHT, Met_DELPDRY, and Met_TropLev, with level convention up (level 0 is surface).

gcpy.dict_diff(dict0, dict1)

Function to take the difference of two dict objects. Assumes that both objects have the same keys.

Args:

dict0, dict1: dict: Dictionaries to be subtracted (dict1 - dict0)

Returns:

result: dict: Key-by-key difference of dict1 - dict0
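A key-by-key difference of this kind can be written as a single dict comprehension. dict_difference below is a hypothetical stand-in that follows the documented dict1 - dict0 convention and, like the documented function, assumes both dicts share the same keys; the sample values are made up.

```python
def dict_difference(dict0, dict1):
    """Hypothetical helper: key-by-key difference dict1 - dict0."""
    return {key: dict1[key] - dict0[key] for key in dict0}


# Illustrative values only.
ref_totals = {"trop": 1.0, "strat": 4.0}
dev_totals = {"trop": 1.5, "strat": 3.0}
diff = dict_difference(ref_totals, dev_totals)
```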

gcpy.reshape_MAPL_CS(da)

Reshapes data if its dimensions indicate MAPL v1.0.0+ output.

Args:

da: xarray DataArray: Data array variable

Returns:

data: xarray DataArray: Data with dimensions renamed and transposed to match old MAPL format

class gcpy._GlobVars(devstr, devdir, devrstdir, year, dst, is_gchp, overwrite, spcdb_dir): Private class _GlobVars contains global data that needs to be shared among the methods in this module.

gcpy.total(globvars, dict_list)

Function to take the sum of a list of dict objects. Assumes that all objects have the same keys.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.
dict_list: list of dict: Dictionaries to be summed.

Returns:

result: dict: Key-by-key sum of all dicts in dict_list.

gcpy.mass_from_rst(globvars, ds, tropmask)

Computes global species mass from a restart file.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.
ds: xarray Dataset: Data containing species mass to be summed.
tropmask: numpy ndarray: Mask to denote tropospheric grid boxes.

Returns:

result: dict: Species mass in strat, trop, and strat+trop regimes.

gcpy.annual_average(globvars, ds, collection, conv_factor)

Computes the annual average of budgets or fluxes.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.
ds: xarray Dataset: Data to be averaged.
collection: str: Name of the diagnostic collection.
conv_factor: str: Conversion factor to be applied.

Returns:

result: dict: Annual-average budgets or fluxes in strat, trop, and strat+trop regimes.

gcpy.annual_average_sources(globvars)

Computes the annual average of radionuclide sources.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.

Returns:

result: dict: Source totals in strat, trop, and strat+trop regimes.

gcpy.trop_residence_time(globvars)

Computes the tropospheric residence time of radionuclides.

Args:

globvars: obj of type _GlobVars: Global variables needed for budget computations.

Returns:

result: dict: Tropospheric residence time for all species.

gcpy.print_budgets(globvars, data, key)

Prints the trop+strat budget file.

Args:

globvars: object of type _GlobVars: Global variables needed for budget computations.
data: dict: Nested dictionary containing budget info.
key: list of str: One of “_f” (full-atmosphere), “_t” (trop-only), or “_s” (strat-only).

gcpy.transport_tracers_budgets(devstr, devdir, devrstdir, year, dst='./1yr_benchmark', is_gchp=False, overwrite=True, spcdb_dir=os.path.dirname(__file__))

Main program to compute TransportTracersBenchmark budgets

Args:

maindir: str: Top-level benchmark folder
devstr: str: Denotes the “Dev” benchmark version.
year: int: The year of the benchmark simulation (e.g. 2016).

Keyword Args (optional):

dst: str: Directory where budget tables will be created. Default value: ‘./1yr_benchmark’
is_gchp: bool: Denotes if data is from GCHP (True) or GCC (False). Default value: False
overwrite: bool: Denotes whether to overwrite existing budget tables. Default value: True
spcdb_dir: str: Directory where species_database.yml is stored. Default value: GCPy directory

gcpy.compare_single_level(refdata, refstr, devdata, devstr, varlist=None, ilev=0, itime=0, refmet=None, devmet=None, weightsdir='.', pdfname='', cmpres=None, match_cbar=True, normalize_by_area=False, enforce_units=True, convert_to_ugm3=False, flip_ref=False, flip_dev=False, use_cmap_RdBu=False, verbose=False, log_color_scale=False, extra_title_txt=None, extent=[-1000, -1000, -1000, -1000], n_job=-1, sigdiff_list=[], second_ref=None, second_dev=None, spcdb_dir=os.path.dirname(__file__), sg_ref_path='', sg_dev_path='', ll_plot_func='imshow', **extra_plot_args)

Create single-level 3x2 comparison map plots for variables common in two xarray Datasets. Optionally save to PDF.

Args:

refdata: xarray dataset: Dataset used as reference in comparison
refstr: str OR list of str: String description for reference data to be used in plots OR list containing [ref1str, ref2str] for diff-of-diffs plots
devdata: xarray dataset: Dataset used as development in comparison
devstr: str OR list of str: String description for development data to be used in plots OR list containing [dev1str, dev2str] for diff-of-diffs plots

Keyword Args (optional):

varlist: list of strings: List of xarray dataset variable names to make plots for Default value: None (will compare all common variables)
ilev: integer: Dataset level dimension index using 0-based system. Indexing is ambiguous when plotting differing vertical grids Default value: 0
itime: integer: Dataset time dimension index using 0-based system Default value: 0
refmet: xarray dataset: Dataset containing ref meteorology Default value: None
devmet: xarray dataset: Dataset containing dev meteorology Default value: None
weightsdir: str: Directory path for storing regridding weights. Default value: ‘.’ (will create/store weights in current directory)
pdfname: str: File path to save plots as PDF Default value: Empty string (will not create PDF)
cmpres: str: String description of grid resolution at which to compare datasets Default value: None (will compare at highest resolution of ref and dev)
match_cbar: bool: Set this flag to True if you wish to use the same colorbar bounds for the Ref and Dev plots. Default value: True
normalize_by_area: bool: Set this flag to True if you wish to normalize the Ref and Dev raw data by grid area. Input ref and dev datasets must include AREA variable in m2 if normalizing by area. Default value: False
enforce_units: bool: Set this flag to True to force an error if Ref and Dev variables have different units. Default value: True
convert_to_ugm3: bool: Whether to convert data units to ug/m3 for plotting. Default value: False
flip_ref: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Ref dataset. Default value: False
flip_dev: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Dev dataset. Default value: False
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap for plotting the raw data in both the Ref and Dev datasets. Default value: False
verbose: bool: Set this flag to True to enable informative printout. Default value: False
log_color_scale: bool: Set this flag to True to plot data (not diffs) on a log color scale. Default value: False
extra_title_txt: str: Specifies extra text (e.g. a date string such as “Jan2016”) for the top-of-plot title. Default value: None
extent: list: Defines the extent of the region to be plotted in form [minlon, maxlon, minlat, maxlat]. Default value plots extent of input grids. Default value: [-1000, -1000, -1000, -1000]
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
sigdiff_list: list of str: Returns a list of all quantities having significant differences (where |max(fractional difference)| > 0.1). Default value: []
second_ref: xarray Dataset: A dataset of the same model type / grid as refdata, to be used in diff-of-diffs plotting. Default value: None
second_dev: xarray Dataset: A dataset of the same model type / grid as devdata, to be used in diff-of-diffs plotting. Default value: None
spcdb_dir: str: Directory containing species_database.yml file. Default value: Path of GCPy code repository
sg_ref_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the ref dataset Default value: ‘’ (will not be read in)
sg_dev_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the dev dataset Default value: ‘’ (will not be read in)
ll_plot_func: str: Function to use for lat/lon single level plotting with possible values ‘imshow’ and ‘pcolormesh’. imshow is much faster but is slightly displaced when plotting from dateline to dateline and/or pole to pole. Default value: ‘imshow’
extra_plot_args: various: Any extra keyword arguments are passed through the plotting functions to be used in calls to pcolormesh() (CS) or imshow() (Lat/Lon).

gcpy.compare_zonal_mean(refdata, refstr, devdata, devstr, varlist=None, itime=0, refmet=None, devmet=None, weightsdir='.', pdfname='', cmpres=None, match_cbar=True, pres_range=[0, 2000], normalize_by_area=False, enforce_units=True, convert_to_ugm3=False, flip_ref=False, flip_dev=False, use_cmap_RdBu=False, verbose=False, log_color_scale=False, log_yaxis=False, extra_title_txt=None, n_job=-1, sigdiff_list=[], second_ref=None, second_dev=None, spcdb_dir=os.path.dirname(__file__), sg_ref_path='', sg_dev_path='', ref_vert_params=[[], []], dev_vert_params=[[], []], **extra_plot_args)

Create 3x2 comparison zonal-mean plots for variables common in two xarray Datasets. Optionally save to PDF.

Args:

refdata: xarray dataset: Dataset used as reference in comparison
refstr: str OR list of str: String description for reference data to be used in plots OR list containing [ref1str, ref2str] for diff-of-diffs plots
devdata: xarray dataset: Dataset used as development in comparison
devstr: str OR list of str: String description for development data to be used in plots OR list containing [dev1str, dev2str] for diff-of-diffs plots

Keyword Args (optional):

varlist: list of strings: List of xarray dataset variable names to make plots for Default value: None (will compare all common 3D variables)
itime: integer: Dataset time dimension index using 0-based system Default value: 0
refmet: xarray dataset: Dataset containing ref meteorology Default value: None
devmet: xarray dataset: Dataset containing dev meteorology Default value: None
weightsdir: str: Directory path for storing regridding weights. Default value: ‘.’ (will create/store weights in current directory)
pdfname: str: File path to save plots as PDF Default value: Empty string (will not create PDF)
cmpres: str: String description of grid resolution at which to compare datasets Default value: None (will compare at highest resolution of Ref and Dev)
match_cbar: bool: Set this flag to True to use the same colorbar bounds for both Ref and Dev plots. Default value: True
pres_range: list of two integers: Pressure range of levels to plot [hPa]. The vertical axis will span the outer pressure edges of levels that contain pres_range endpoints. Default value: [0,2000]
normalize_by_area: bool: Set this flag to True to normalize raw data in both Ref and Dev datasets by grid area. Input ref and dev datasets must include AREA variable in m2 if normalizing by area. Default value: False
enforce_units: bool: Set this flag to True to force an error if the variables in the Ref and Dev datasets have different units. Default value: True
convert_to_ugm3: bool: Whether to convert data units to ug/m3 for plotting. Default value: False
flip_ref: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Ref dataset. Default value: False
flip_dev: bool: Set this flag to True to flip the vertical dimension of 3D variables in the Dev dataset. Default value: False
use_cmap_RdBu: bool: Set this flag to True to use a blue-white-red colormap for plotting raw reference and development datasets. Default value: False
verbose: bool: Set this flag to True to enable informative printout. Default value: False
log_color_scale: bool: Set this flag to True to enable plotting data (not diffs) on a log color scale. Default value: False
log_yaxis: bool: Set this flag to True if you wish to create zonal mean plots with a log-pressure Y-axis. Default value: False
extra_title_txt: str: Specifies extra text (e.g. a date string such as “Jan2016”) for the top-of-plot title. Default value: None
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
sigdiff_list: list of str: Returns a list of all quantities having significant differences (where |max(fractional difference)| > 0.1). Default value: []
second_ref: xarray Dataset: A dataset of the same model type / grid as refdata, to be used in diff-of-diffs plotting. Default value: None
second_dev: xarray Dataset: A dataset of the same model type / grid as devdata, to be used in diff-of-diffs plotting. Default value: None
spcdb_dir: str: Directory containing species_database.yml file. Default value: Path of GCPy code repository
sg_ref_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the ref dataset Default value: ‘’ (will not be read in)
sg_dev_path: str: Path to NetCDF file containing stretched-grid info (in attributes) for the dev dataset Default value: ‘’ (will not be read in)
ref_vert_params: list(AP, BP) of list-like types: Hybrid grid parameter A in hPa and B (unitless). Needed if ref grid is not 47 or 72 levels. Default value: [[], []]
dev_vert_params: list(AP, BP) of list-like types: Hybrid grid parameter A in hPa and B (unitless). Needed if dev grid is not 47 or 72 levels. Default value: [[], []]
extra_plot_args: various: Any extra keyword arguments are passed through the plotting functions to be used in calls to pcolormesh() (CS) or imshow() (Lat/Lon).

gcpy.create_regridders(refds, devds, weightsdir='.', reuse_weights=True, cmpres=None, zm=False, sg_ref_params=[1, 170, -90], sg_dev_params=[1, 170, -90])

Internal function used for creating regridders between two datasets. Follows decision logic needed for plotting functions. Originally code from compare_single_level and compare_zonal_mean.

Args:

refds: xarray Dataset: Input dataset
devds: xarray Dataset: Output dataset

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
cmpres: int or str: Specific target resolution for comparison grid used in difference and ratio plots Default value: None (will follow logic chain below)
zm: bool: Set this flag to True if regridders will be used in zonal mean plotting Default value: False
sg_ref_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Ref grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)
sg_dev_params: list[float, float, float] (stretch_factor, target_longitude, target_latitude): Dev grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Default value: [1, 170, -90] (no stretching)

Returns:

list of many different quantities needed for regridding in plotting functions

refres, devres, cmpres: str or int: Resolution of a dataset grid
refgridtype, devgridtype, cmpgridtype: str: Gridtype of a dataset (‘ll’ or ‘cs’)
regridref, regriddev, regridany: bool: Whether to regrid a dataset
refgrid, devgrid, cmpgrid: dict: Grid definition of a dataset
refregridder, devregridder: xESMF regridder: Regridder object between refgrid or devgrid and cmpgrid (will be None if input grid is not lat/lon)
refregridder_list, devregridder_list: list[6 xESMF regridders]: List of regridder objects for each face between refgrid or devgrid and cmpgrid (will be None if input grid is not cubed-sphere)

gcpy.get_troposphere_mask(ds)

Returns a mask array for picking out the tropospheric grid boxes.

Args:

ds: xarray Dataset: Dataset containing certain met field variables (i.e. Met_TropLev, Met_BXHEIGHT).

Returns:

tropmask: numpy ndarray: Tropospheric mask. False denotes grid boxes that are in the troposphere and True in the stratosphere (as per Python masking logic).
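
The mask convention (False in the troposphere, True in the stratosphere) can be illustrated with a small pure-Python sketch. It assumes, which the text above does not state, that a box is tropospheric when its 1-based level index does not exceed the tropopause level of its column. `toy_troposphere_mask` and the sample values are hypothetical and only stand in for the real routine.

```python
def toy_troposphere_mask(trop_lev, n_lev):
    """Return mask[level][column]: False = troposphere, True = stratosphere.

    trop_lev holds one (hypothetical) 1-based tropopause level per column.
    """
    return [
        [(lev + 1) > top for top in trop_lev]
        for lev in range(n_lev)
    ]

mask = toy_troposphere_mask(trop_lev=[2, 3], n_lev=4)
values = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]

# Keep only unmasked (False) boxes, as a masked array would.
trop_total = sum(
    v
    for row_v, row_m in zip(values, mask)
    for v, m in zip(row_v, row_m)
    if not m
)
```

With a mask in this convention, tropospheric totals are obtained by keeping the boxes where the mask is False.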

gcpy.convert_units(dr, species_name, species_properties, target_units, interval=[2678400.0], area_m2=None, delta_p=None, box_height=None)

Converts data stored in an xarray DataArray object from its native units to a target unit.

Args:

dr: xarray DataArray: Data to be converted from native units to target units.
species_name: str: Name of the species corresponding to the data stored in “dr”.
species_properties: dict: Dictionary containing species properties (e.g. molecular weights and other metadata) for the given species.
target_units: str: Units to which the data will be converted.

Keyword Args (optional):

interval: list of float: The length of the averaging period in seconds. Default value: [2678400.0]
area_m2: xarray DataArray: Surface area in square meters Default value: None
delta_p: xarray DataArray: Delta-pressure between top and bottom edges of grid box (dry air) in hPa Default value: None
box_height: xarray DataArray: Grid box height in meters Default value: None

Returns:

dr_new: xarray DataArray: Data converted to target units.

Remarks:

At present, only certain types of unit conversions have been implemented (corresponding to the most commonly used unit conversions for model benchmark output).

When mol mol-1 is present as a unit, dry air is assumed.
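
As an illustration of the kind of arithmetic involved, the sketch below converts an emission flux in kg m-2 s-1 to a total in Tg over the averaging interval. It is not gcpy's implementation: `flux_to_tg` and its inputs are hypothetical, and only the default interval (86400 * 31 seconds) comes from the text above.

```python
SECONDS_PER_31_DAY_MONTH = 86400.0 * 31  # 2678400.0, the documented default
KG_PER_TG = 1.0e9

def flux_to_tg(flux_kg_m2_s, area_m2, interval_s=SECONDS_PER_31_DAY_MONTH):
    """Convert a flux (kg m-2 s-1) over one grid box to a mass total in Tg."""
    return flux_kg_m2_s * area_m2 * interval_s / KG_PER_TG

# Hypothetical grid box: 1e-9 kg m-2 s-1 over 1e10 m2 for a 31-day month.
total_tg = flux_to_tg(flux_kg_m2_s=1.0e-9, area_m2=1.0e10)
```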

gcpy.warning_format

gcpy.aod_spc = aod_species.yml

gcpy.spc_categories = benchmark_categories.yml

gcpy.emission_spc = emission_species.yml

gcpy.emission_inv = emission_inventories.yml

gcpy.create_total_emissions_table(refdata, refstr, devdata, devstr, species, outfilename, ref_interval=[2678400.0], dev_interval=[2678400.0], template='Emis{}_', refmetdata=None, devmetdata=None, spcdb_dir=os.path.dirname(__file__))

Creates a table of emissions totals (by sector and by inventory) for a list of species contained in two data sets. The data sets, which typically represent output from two different model versions, are usually contained in netCDF data files.

Args:

refdata: xarray Dataset

The first data set to be compared (aka “Reference” or “Ref”).

refstr: str

A string that can be used to identify refdata (e.g. a model version number or other identifier).

devdata: xarray Dataset

The second data set to be compared (aka “Development” or “Dev”).

devstr: str

A string that can be used to identify devdata (e.g. a model version number or other identifier).

species: dict

Dictionary containing the name of each species and the target unit that emissions will be converted to. The format of species is as follows:

{ “species_name”: “target_unit”, etc. }

where “species_name” and “target_unit” are strs.

outfilename: str

Name of the text file which will contain the table of emissions totals.

Keyword Args (optional):

ref_interval: list of float: The length of the ref data interval in seconds. By default, interval is set to the number of seconds in a 31-day month (86400 * 31), which corresponds to typical benchmark simulation output. Default value: [2678400.0]
dev_interval: list of float: The length of the dev data interval in seconds. By default, interval is set to the number of seconds in a 31-day month (86400 * 31), which corresponds to typical benchmark simulation output. Default value: [2678400.0]
template: str: Template for the diagnostic names that are contained in both the “Reference” and “Development” data sets. If not specified, template will be set to “Emis{}_”, where {} will be replaced by the species name. Default value: “Emis{}_”
ref_area_varname: str: Name of the variable containing the grid box surface areas (in m2) in the ref dataset. Default value: ‘AREA’
dev_area_varname: str: Name of the variable containing the grid box surface areas (in m2) in the dev dataset. Default value: ‘AREA’
refmetdata: xarray dataset: Dataset containing ref meteorology and area Default value: None
devmetdata: xarray dataset: Dataset containing dev meteorology and area Default value: None
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository

Remarks:

This method is mainly intended for model benchmarking purposes, rather than as a general-purpose tool.

Species properties (such as molecular weights) are read from a YAML file called “species_database.yml”.
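
The species dictionary and the template argument described above can be sketched as follows. The species names and target units are hypothetical examples in the documented format; only the default template “Emis{}_” is taken from the text.

```python
# Hypothetical species-to-unit mapping in the documented format:
# { "species_name": "target_unit", ... }
species = {"CO": "Tg", "NO": "Tg N"}

template = "Emis{}_"  # documented default

# Diagnostic-name prefix that the template yields for each species.
prefixes = {name: template.format(name) for name in species}
```

Here `prefixes` maps each species to the leading part of its emissions diagnostic names, for example “EmisCO_” for CO.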

gcpy.create_global_mass_table(refdata, refstr, devdata, devstr, varlist, met_and_masks, label, trop_only=False, outfilename='GlobalMass_TropStrat.txt', verbose=False, spcdb_dir=os.path.dirname(__file__))

Creates a table of global masses for a list of species contained in two data sets. The data sets, which typically represent output from two different model versions, are usually contained in netCDF data files.

Args:

refdata: xarray Dataset: The first data set to be compared (aka “Reference”).
refstr: str: A string that can be used to identify refdata (e.g. a model version number or other identifier).
devdata: xarray Dataset: The second data set to be compared (aka “Development”).
devstr: str: A string that can be used to identify devdata (e.g. a model version number or other identifier).
varlist: list of strings: List of species concentration variable names to include in the list of global totals.
met_and_masks: dict of xarray DataArray: Dictionary containing the meteorological variables and masks for the Ref and Dev datasets.
label: str: Label to go in the header string. Can be used to pass the month & year.

Keyword Args (optional):

trop_only: bool: Set this switch to True if you wish to print totals only for the troposphere. Default value: False (i.e. print whole-atmosphere totals).
outfilename: str: Name of the text file which will contain the table of global mass totals. Default value: “GlobalMass_TropStrat.txt”
verbose: bool: Set this switch to True if you wish to print out extra informational messages. Default value: False
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository

Remarks:

This method is mainly intended for model benchmarking purposes, rather than as a general-purpose tool.

Species properties (such as molecular weights) are read from a YAML file called “species_database.yml”.

gcpy.make_benchmark_conc_plots(ref, refstr, dev, devstr, dst='./benchmark', subdst=None, overwrite=False, verbose=False, collection='SpeciesConc', benchmark_type='FullChemBenchmark', plot_by_spc_cat=True, restrict_cats=[], plots=['sfc', '500hpa', 'zonalmean'], use_cmap_RdBu=False, log_color_scale=False, sigdiff_files=None, normalize_by_area=False, cats_in_ugm3=['Aerosols', 'Secondary_Organic_Aerosols'], areas=None, refmet=None, devmet=None, weightsdir='.', n_job=-1, second_ref=None, second_dev=None, time_mean=False, spcdb_dir=os.path.dirname(__file__))

Creates PDF files containing plots of species concentration for model benchmarking purposes.

Args:

ref: str: Path name for the “Ref” (aka “Reference”) data set.
refstr: str OR list of str: A string to describe ref (e.g. version number) OR list containing [ref1str, ref2str] for diff-of-diffs plots
dev: str: Path name for the “Dev” (aka “Development”) data set. This data set will be compared against the “Reference” data set.
devstr: str OR list of str: A string to describe dev (e.g. version number) OR list containing [dev1str, dev2str] for diff-of-diffs plots

Keyword Args (optional):

dst: str

A string denoting the destination folder where a PDF file containing plots will be written. Default value: ./benchmark

subdst: str

A string denoting the sub-directory of dst where PDF files containing plots will be written. In practice, subdst is only needed for the 1-year benchmark output, and denotes a date string (such as “Jan2016”) that corresponds to the month that is being plotted. Default value: None

overwrite: bool

Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False

verbose: bool

Set this flag to True to print extra informational output. Default value: False

collection: str
Name of collection to use for plotting. Default value: “SpeciesConc”

benchmark_type: str
A string denoting the type of benchmark output to plot, either FullChemBenchmark or TransportTracersBenchmark. Default value: “FullChemBenchmark”

plot_by_spc_cat: bool

Set this flag to False to send plots to one file rather than separate file per category. Default value: True

restrict_cats: list of strings

List of benchmark categories in benchmark_categories.yml to make plots for. If empty, plots are made for all categories. Default value: empty

plots: list of strings

List of plot types to create. Default value: [‘sfc’, ‘500hpa’, ‘zonalmean’]

log_color_scale: bool

Set this flag to True to enable plotting data (not diffs) on a log color scale. Default value: False

normalize_by_area: bool

Set this flag to True to enable normalization of data by surface area (i.e. kg s-1 –> kg s-1 m-2). Default value: False

cats_in_ugm3: list of str

List of benchmark categories to convert to ug/m3 Default value: [“Aerosols”, “Secondary_Organic_Aerosols”]

areas: dict of xarray DataArray

Grid box surface areas in m2 on Ref and Dev grids. Default value: None

refmet: str

Path name for ref meteorology Default value: None

devmet: str

Path name for dev meteorology Default value: None

sigdiff_files: list of str

Filenames that will contain the lists of species having significant differences in the ‘sfc’, ‘500hpa’, and ‘zonalmean’ plots. These lists are needed in order to fill out the benchmark approval forms. Default value: None

weightsdir: str

Directory in which to place (and possibly reuse) xESMF regridder netCDF files. Default value: ‘.’

n_job: int

Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1

second_ref: str

Path name for a second “Ref” (aka “Reference”) data set for diff-of-diffs plotting. This dataset should have the same model type and grid as ref. Default value: None

second_dev: str

Path name for a second “Dev” (aka “Development”) data set for diff-of-diffs plotting. This dataset should have the same model type and grid as dev. Default value: None

spcdb_dir: str

Directory of species_database.yml file Default value: Directory of GCPy code repository

time_mean: bool

Determines if we should average the datasets over time Default value: False
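
A set of keyword arguments for make_benchmark_conc_plots might be assembled as in the sketch below. The sigdiff file names, the choice of overwrite=True and the placeholder paths in the commented call are assumptions for illustration; the keyword names and the plot types come from the descriptions above.

```python
import os

dst = "./benchmark"
subdst = "Jan2016"  # date string, as in the description of subdst
plots = ["sfc", "500hpa", "zonalmean"]

# One (hypothetical) file name per plot type for the significant-difference lists.
sigdiff_files = [
    os.path.join(dst, subdst, f"sigdiffs_{plot}.txt") for plot in plots
]

conc_plot_kwargs = dict(
    dst=dst,
    subdst=subdst,
    plots=plots,
    sigdiff_files=sigdiff_files,
    overwrite=True,
    n_job=1,  # 1 disables parallel plotting
)

# The call itself would then look like this (paths and labels are placeholders):
# gcpy.make_benchmark_conc_plots(ref_path, "Ref", dev_path, "Dev", **conc_plot_kwargs)
```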

gcpy.make_benchmark_emis_plots(ref, refstr, dev, devstr, dst='./benchmark', subdst=None, plot_by_spc_cat=False, plot_by_hco_cat=False, overwrite=False, verbose=False, flip_ref=False, flip_dev=False, log_color_scale=False, sigdiff_files=None, weightsdir='.', n_job=-1, time_mean=False, spcdb_dir=os.path.dirname(__file__))

Creates PDF files containing plots of emissions for model benchmarking purposes. This function is compatible with benchmark simulation output only. It is not compatible with transport tracers emissions diagnostics.

Args:

ref: str: Path name for the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name for the “Dev” (aka “Development”) data set. This data set will be compared against the “Reference” data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

dst: str

A string denoting the destination folder where PDF files containing plots will be written. Default value: ‘./benchmark’

subdst: str

A string denoting the sub-directory of dst where PDF files containing plots will be written. In practice, subdst is only needed for the 1-year benchmark output, and denotes a date string (such as “Jan2016”) that corresponds to the month that is being plotted. Default value: None

plot_by_spc_cat: bool

Set this flag to True to separate plots into PDF files according to the benchmark species categories (e.g. Oxidants, Aerosols, Nitrogen, etc.). These categories are specified in the YAML file benchmark_categories.yml. Default value: False

plot_by_hco_cat: bool

Set this flag to True to separate plots into PDF files according to HEMCO emissions categories (e.g. Anthro, Aircraft, Bioburn, etc.) Default value: False

overwrite: bool

Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False

verbose: bool

Set this flag to True to print extra informational output. Default value: False

flip_ref: bool

Set this flag to True to reverse the vertical level ordering in the “Ref” dataset (in case “Ref” starts from the top of atmosphere instead of the surface). Default value: False

flip_dev: bool

Set this flag to True to reverse the vertical level ordering in the “Dev” dataset (in case “Dev” starts from the top of atmosphere instead of the surface). Default value: False

log_color_scale: bool

Set this flag to True to enable plotting data (not diffs) on a log color scale. Default value: False

sigdiff_files: list of str: Filenames that will contain the lists of species having significant differences in the ‘sfc’, ‘500hpa’, and ‘zonalmean’ plots. These lists are needed in order to fill out the benchmark approval forms. Default value: None

weightsdir: str

Directory in which to place (and possibly reuse) xESMF regridder netCDF files. Default value: ‘.’

n_job: int

Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1

spcdb_dir: str

Directory of species_database.yml file Default value: Directory of GCPy code repository

time_mean: bool

Determines if we should average the datasets over time Default value: False

Remarks:

If both plot_by_spc_cat and plot_by_hco_cat are False, then all emission plots will be placed into the same PDF file.
Emissions that are 3-dimensional will be plotted as column sums.

gcpy.make_benchmark_emis_tables(reflist, refstr, devlist, devstr, dst='./benchmark', refmet=None, devmet=None, overwrite=False, ref_interval=[2678400.0], dev_interval=[2678400.0], spcdb_dir=os.path.dirname(__file__))

Creates a text file containing emission totals by species and category for benchmarking purposes.

Args:

reflist: list of str: List with the path names of the emissions file or files (multiple months) that will constitute the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
devlist: list of str: List with the path names of the emissions file or files (multiple months) that will constitute the “Dev” (aka “Development”) data set
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

dst: str: A string denoting the destination folder where the file containing emissions totals will be written. Default value: ./benchmark
refmet: str: Path name for ref meteorology Default value: None
devmet: str: Path name for dev meteorology Default value: None
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False
ref_interval: list of float: The length of the ref data interval in seconds. By default, interval is set to [2678400.0], which is the number of seconds in July (our 1-month benchmarking month). Default value: [2678400.0]
dev_interval: list of float: The length of the dev data interval in seconds. By default, interval is set to [2678400.0], which is the number of seconds in July (our 1-month benchmarking month). Default value: [2678400.0]
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository
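
When reflist and devlist span several months, ref_interval and dev_interval need one entry per file. The sketch below builds such a list with the standard library. `month_intervals` is a hypothetical helper, and the year 2016 is only an example.

```python
import calendar

def month_intervals(year, months):
    """Return the length of each month in seconds, as a list of floats."""
    return [
        86400.0 * calendar.monthrange(year, month)[1] for month in months
    ]

july = month_intervals(2016, [7])              # single-month benchmark
jan_to_mar = month_intervals(2016, [1, 2, 3])  # three monthly files
```

For July this reproduces the documented default of [2678400.0].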

gcpy.make_benchmark_jvalue_plots(ref, refstr, dev, devstr, varlist=None, dst='./benchmark', subdst=None, local_noon_jvalues=False, plots=['sfc', '500hpa', 'zonalmean'], overwrite=False, verbose=False, flip_ref=False, flip_dev=False, log_color_scale=False, sigdiff_files=None, weightsdir='.', n_job=-1, time_mean=False, spcdb_dir=os.path.dirname(__file__))

Creates PDF files containing plots of J-values for model benchmarking purposes.

Args:

ref: str: Path name for the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name for the “Dev” (aka “Development”) data set. This data set will be compared against the “Reference” data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

varlist: list of str: List of J-value variables to plot. If not passed, then all J-value variables common to both dev and ref will be plotted. The varlist argument can be a useful way of restricting the number of variables plotted to the pdf file when debugging. Default value: None
dst: str: A string denoting the destination folder where a PDF file containing plots will be written. Default value: ./benchmark.
subdst: str: A string denoting the sub-directory of dst where PDF files containing plots will be written. In practice, subdst is only needed for the 1-year benchmark output, and denotes a date string (such as “Jan2016”) that corresponds to the month that is being plotted. Default value: None
local_noon_jvalues: bool: Set this flag to plot local noon J-values. This will divide all J-value variables by the JNoonFrac counter, which is the fraction of the time that it was local noon at each location. Default value: False
plots: list of strings: List of plot types to create. Default value: [‘sfc’, ‘500hpa’, ‘zonalmean’]
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False.
verbose: bool: Set this flag to True to print extra informational output. Default value: False
flip_ref: bool: Set this flag to True to reverse the vertical level ordering in the “Ref” dataset (in case “Ref” starts from the top of atmosphere instead of the surface). Default value: False
flip_dev: bool: Set this flag to True to reverse the vertical level ordering in the “Dev” dataset (in case “Dev” starts from the top of atmosphere instead of the surface). Default value: False
log_color_scale: bool: Set this flag to True if you wish to enable plotting data (not diffs) on a log color scale. Default value: False
sigdiff_files: list of str: Filenames that will contain the lists of J-values having significant differences in the ‘sfc’, ‘500hpa’, and ‘zonalmean’ plots. These lists are needed in order to fill out the benchmark approval forms. Default value: None
weightsdir: str: Directory in which to place (and possibly reuse) xESMF regridder netCDF files. Default value: ‘.’
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository
time_mean: bool: Determines if we should average the datasets over time Default value: False

Remarks:

Will create 4 files containing J-value plots:
(1) Surface values
(2) 500 hPa values
(3a) Full-column zonal mean values
(3b) Stratospheric zonal mean values

These can be toggled on/off with the plots keyword argument.

At present, we do not yet have the capability to split the plots up into separate files per category (e.g. Oxidants, Aerosols, etc.). This is primarily due to the fact that we archive J-values from GEOS-Chem for individual species but not family species. We could attempt to add this functionality later if there is sufficient demand.
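
The division performed when local_noon_jvalues is set can be illustrated with plain lists. This is a sketch only: `to_local_noon` is a hypothetical helper, and returning 0.0 where the noon fraction is zero is an assumption of the sketch, not documented gcpy behavior.

```python
def to_local_noon(jvalues, noon_frac):
    """Divide each J-value by the local-noon fraction at the same location.

    Locations where the fraction is zero are returned as 0.0 here; that
    choice is an assumption of this sketch.
    """
    return [j / f if f > 0 else 0.0 for j, f in zip(jvalues, noon_frac)]

# Hypothetical time-averaged J-values and JNoonFrac values at three locations.
noon_j = to_local_noon([2.0e-6, 4.0e-6, 1.0e-6], [0.5, 0.25, 0.0])
```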

gcpy.make_benchmark_aod_plots(ref, refstr, dev, devstr, varlist=None, dst='./benchmark', subdst=None, overwrite=False, verbose=False, log_color_scale=False, sigdiff_files=None, weightsdir='.', n_job=-1, time_mean=False, spcdb_dir=os.path.dirname(__file__))

Creates PDF files containing plots of column aerosol optical depths (AODs) for model benchmarking purposes.

Args:

ref: str: Path name for the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name for the “Dev” (aka “Development”) data set. This data set will be compared against the “Reference” data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

varlist: list of str: List of AOD variables to plot. If not passed, then all AOD variables common to both Dev and Ref will be plotted. Use the varlist argument to restrict the number of variables plotted to the pdf file when debugging. Default value: None
dst: str: A string denoting the destination folder where a PDF file containing plots will be written. Default value: ./benchmark.
subdst: str: A string denoting the sub-directory of dst where PDF files containing plots will be written. In practice, subdst is only needed for the 1-year benchmark output, and denotes a date string (such as “Jan2016”) that corresponds to the month that is being plotted. Default value: None
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False.
verbose: bool: Set this flag to True to print extra informational output. Default value: False
log_color_scale: bool: Set this flag to True to enable plotting data (not diffs) on a log color scale. Default value: False
sigdiff_files: list of str: Filenames that will contain the list of quantities having significant differences in the column AOD plots. These lists are needed in order to fill out the benchmark approval forms. Default value: None
weightsdir: str: Directory in which to place (and possibly reuse) xESMF regridder netCDF files. Default value: ‘.’
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository
time_mean: bool: Determines if we should average the datasets over time Default value: False

gcpy.make_benchmark_mass_tables(ref, refstr, dev, devstr, varlist=None, dst='./benchmark', subdst=None, overwrite=False, verbose=False, label='at end of simulation', spcdb_dir=os.path.dirname(__file__), ref_met_extra='', dev_met_extra='')

Creates a text file containing global mass totals by species and category for benchmarking purposes.

Args:

ref: str: Pathname that will constitute the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Pathname that will constitute the “Dev” (aka “Development”) data set. The “Dev” data set will be compared against the “Ref” data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

varlist: list of str: List of variables to include in the list of totals. If omitted, then all variables that are found in either “Ref” or “Dev” will be included. The varlist argument can be a useful way of reducing the number of variables during debugging and testing. Default value: None
dst: str: A string denoting the destination folder where the file containing global mass totals will be written. Default value: ./benchmark
subdst: str: A string denoting the sub-directory of dst where PDF files containing plots will be written. In practice, subdst is only needed for the 1-year benchmark output, and denotes a date string (such as “Jan2016”) that corresponds to the month that is being plotted. Default value: None
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False
verbose: bool: Set this flag to True to print extra informational output. Default value: False.
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository
ref_met_extra: str: Path to ref Met file containing area data for use with restart files which do not contain the Area variable. Default value: ‘’
dev_met_extra: str: Path to dev Met file containing area data for use with restart files which do not contain the Area variable. Default value: ‘’

gcpy.make_benchmark_oh_metrics(ref, refmet, refstr, dev, devmet, devstr, dst='./benchmark', overwrite=False)

Creates a text file containing metrics of global mean OH, MCF lifetime, and CH4 lifetime for benchmarking purposes.

Args:

ref: str: Path name of “Ref” (aka “Reference”) data set file.
refmet: str: Path name of ref meteorology data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name of “Dev” (aka “Development”) data set file. The “Dev” data set will be compared against the “Ref” data set.
devmet: str: Path name of dev meteorology data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

dst: str: A string denoting the destination folder where the file containing OH metrics will be written. Default value: “./benchmark”
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False

gcpy.make_benchmark_wetdep_plots(ref, refstr, dev, devstr, collection, dst='./benchmark', datestr=None, overwrite=False, verbose=False, benchmark_type='TransportTracersBenchmark', plots=['sfc', '500hpa', 'zonalmean'], log_color_scale=False, normalize_by_area=False, areas=None, refmet=None, devmet=None, weightsdir='.', n_job=-1, time_mean=False, spcdb_dir=os.path.dirname(__file__))

Creates PDF files containing plots of wet deposition diagnostics for model benchmarking purposes.

Args:

ref: str: Path name for the “Ref” (aka “Reference”) data set.
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name for the “Dev” (aka “Development”) data set. This data set will be compared against the “Reference” data set.
devstr: str: A string to describe dev (e.g. version number)
collection: str: String name of collection to plot comparisons for.

Keyword Args (optional):

dst: str: A string denoting the destination folder where a PDF file containing plots will be written. Default value: ./benchmark
datestr: str: A string with date information to be included in both the plot pdf filename and as a destination folder subdirectory for writing plots Default value: None
benchmark_type: str: A string denoting the type of benchmark output to plot, either FullChemBenchmark or TransportTracersBenchmark. Default value: “TransportTracersBenchmark”
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False.
verbose: bool: Set this flag to True to print extra informational output. Default value: False.
plots: list of strings: List of plot types to create. Default value: [‘sfc’, ‘500hpa’, ‘zonalmean’]
normalize_by_area: bool: Set this flag to True to enable normalization of data by surface area (i.e. kg s-1 –> kg s-1 m-2). Default value: False
areas: dict of xarray DataArray: Grid box surface areas in m2 on Ref and Dev grids. Default value: None
refmet: str: Path name for ref meteorology Default value: None
devmet: str: Path name for dev meteorology Default value: None
n_job: int: Defines the number of simultaneous workers for parallel plotting. Set to 1 to disable parallel plotting. Value of -1 allows the application to decide. Default value: -1
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository
time_mean: bool: Determines if we should average the datasets over time Default value: False

gcpy.make_benchmark_aerosol_tables(devdir, devlist_aero, devlist_spc, devlist_met, devstr, year, days_per_mon, dst='./benchmark', overwrite=False, is_gchp=False, spcdb_dir=os.path.dirname(__file__))

Compute FullChemBenchmark aerosol budgets & burdens

Args:

devdir: str: Path to development (“Dev”) data directory
devlist_aero: list of str: List of Aerosols collection files (different months)
devlist_spc: list of str: List of SpeciesConc collection files (different months)
devlist_met: list of str: List of meteorology collection files (different months)
devstr: str: Descriptive string for datasets (e.g. version number)
year: str: The year of the benchmark simulation (e.g. ‘2016’).
days_per_mon: list of int: List of number of days per month for all months

Keyword Args (optional):

dst: str: Directory where budget tables will be created. Default value: ‘./benchmark’
overwrite: bool: Set this flag to True to overwrite existing burden & budget tables. Default value: False
is_gchp: bool: Whether datasets are for GCHP Default value: False
spcdb_dir: str: Directory of species_database.yml file Default value: Directory of GCPy code repository

gcpy.make_benchmark_operations_budget(refstr, reffiles, devstr, devfiles, ref_interval, dev_interval, benchmark_type=None, label=None, col_sections=['Full', 'Trop', 'PBL', 'Strat'], operations=['Chemistry', 'Convection', 'EmisDryDep', 'Mixing', 'Transport', 'WetDep'], compute_accum=True, require_overlap=False, dst='.', species=None, overwrite=True)

Prints the “operations budget” (i.e. change in mass after each operation) from a GEOS-Chem benchmark simulation.

Args:

refstr: str: Labels denoting the “Ref” versions
reffiles: list of str: Lists of files to read from the “Ref” version.
devstr: str: Labels denoting the “Dev” versions
devfiles: list of str: Lists of files to read from “Dev” version.
ref_interval, dev_interval: float: Number of seconds in the “Ref” and “Dev” diagnostic intervals.

Keyword Args (optional):

benchmark_type: str: “TransportTracersBenchmark” or “FullChemBenchmark”. Default value: None
label: str: Contains the date or date range for each dataframe title. Default value: None
col_sections: list of str: List of column sections to calculate global budgets for. May include Strat even though it is not calculated in GEOS-Chem, but Full and Trop must also be present to calculate Strat. Default value: [“Full”, “Trop”, “PBL”, “Strat”]
operations: list of str: List of operations to calculate global budgets for. Accumulation should not be included. It will automatically be calculated if all GEOS-Chem budget operations are passed and optional arg compute_accum is True. Default value: [“Chemistry”, “Convection”, “EmisDryDep”, “Mixing”, “Transport”, “WetDep”]
compute_accum: bool: Optionally turn on/off accumulation calculation. If True, will only compute accumulation if all six GEOS-Chem operations budgets are computed. Otherwise a message will be printed warning that accumulation will not be calculated. Default value: True
require_overlap: bool: Whether to calculate budgets only for species that are present in both Ref and Dev. Default value: False
dst: str: Directory where plots & tables will be created. Default value: ‘.’ (directory in which function is called)
species: list of str: List of species for which budgets will be created. Default value: None (all species)
overwrite: bool: Denotes whether to overwrite existing budget file. Default value: True
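
The conditions described for col_sections and compute_accum can be sketched as below. Two points are assumptions of the sketch rather than statements from the text: that Strat is obtained as Full minus Trop, and that accumulation is the sum of the six operations. The helper names and budget values are hypothetical.

```python
ALL_OPERATIONS = ["Chemistry", "Convection", "EmisDryDep",
                  "Mixing", "Transport", "WetDep"]

def can_compute_strat(col_sections):
    """Strat needs both Full and Trop to be requested as well."""
    return {"Full", "Trop", "Strat"} <= set(col_sections)

def accumulation(budget, compute_accum=True):
    """Sum all six operations, or return None if any is missing."""
    if not compute_accum or not all(op in budget for op in ALL_OPERATIONS):
        return None
    return sum(budget[op] for op in ALL_OPERATIONS)

# Hypothetical budgets (arbitrary mass units) for one species.
full = {"Chemistry": -3.0, "Convection": 0.0, "EmisDryDep": 5.0,
        "Mixing": 0.0, "Transport": 0.0, "WetDep": -1.0}
trop = dict(full, Chemistry=-2.5)

strat_chem = full["Chemistry"] - trop["Chemistry"]  # assumed: Full minus Trop
accum_full = accumulation(full)
accum_partial = accumulation({"Chemistry": -3.0})   # incomplete, so None
```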

gcpy.make_benchmark_mass_conservation_table(datafiles, runstr, dst='./benchmark', overwrite=False, spcdb_dir=os.path.dirname(__file__))

Creates a text file containing global mass of the PassiveTracer from Transport Tracer simulations across a series of restart files.

Args:

datafiles: list of str: Path names of restart files.
runstr: str: Name to put in the filename and header of the output file
refstr: str: A string to describe ref (e.g. version number)
dev: str: Path name of “Dev” (aka “Development”) data set file. The “Dev” data set will be compared against the “Ref” data set.
devmet: list of str: Path name of dev meteorology data set.
devstr: str: A string to describe dev (e.g. version number)

Keyword Args (optional):

dst: str: A string denoting the destination folder where the file containing emissions totals will be written. Default value: “./benchmark”
overwrite: bool: Set this flag to True to overwrite files in the destination folder (specified by the dst argument). Default value: False

gcpy.get_input_res(data)

Returns resolution of dataset passed to compare_single_level or compare_zonal_mean

Args:

data: xarray Dataset: Input GEOS-Chem dataset

Returns:

res: str or int: Lat/lon res of the form ‘latresxlonres’ or cubed-sphere resolution
gridtype: str: ‘ll’ for lat/lon or ‘cs’ for cubed-sphere
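
The two return shapes can be told apart with a small helper. The sketch below is not part of GCPy; describe_resolution is a hypothetical name, and it assumes that a lat/lon resolution is a string such as '4x5' and a cubed-sphere resolution is an integer such as 48.

```python
def describe_resolution(res):
    """Hypothetical helper: classify a resolution value.

    Returns a (gridtype, value) pair, where gridtype is 'cs' for an
    integer cubed-sphere resolution and 'll' for a 'latresxlonres' string.
    """
    if isinstance(res, int):
        return "cs", res
    # Split 'latresxlonres' into its two numeric parts.
    lat_text, lon_text = res.split("x")
    return "ll", (float(lat_text), float(lon_text))
```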

gcpy.get_vert_grid(dataset, AP=[], BP=[])

Determine vertical grid of input dataset

Args:

dataset: xarray Dataset: A GEOS-Chem output dataset

Keyword Args (optional):

AP: list-like type: Hybrid grid parameter A in hPa. Default value: []
BP: list-like type: Hybrid grid parameter B (unitless). Default value: []

Returns:

p_edge: numpy array: Edge pressure values for vertical grid
p_mid: numpy array: Midpoint pressure values for vertical grid
nlev: int: Number of levels in vertical grid
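
For readers unfamiliar with hybrid grids, the sketch below shows how edge and midpoint pressures can be derived from AP and BP using the standard hybrid relation p_edge = AP + BP * surface_pressure. It is an illustration under stated assumptions, not GCPy's code: the function name hybrid_pressures, the surface pressure of 1000 hPa, and the use of an arithmetic mean for midpoints are all assumptions.

```python
def hybrid_pressures(ap, bp, surface_pressure=1000.0):
    """Hypothetical helper: pressures (hPa) on a hybrid vertical grid.

    ap and bp hold the hybrid A (hPa) and B (unitless) parameters at
    level edges; surface_pressure is an assumed value in hPa.
    """
    if len(ap) != len(bp):
        raise ValueError("AP and BP must have the same length")
    # Edge pressure: A + B * surface pressure.
    p_edge = [a + b * surface_pressure for a, b in zip(ap, bp)]
    # Midpoint pressure: mean of the two bounding edges (assumption).
    p_mid = [(lower + upper) / 2.0
             for lower, upper in zip(p_edge[:-1], p_edge[1:])]
    nlev = len(p_mid)
    return p_edge, p_mid, nlev
```

With three edges there are two levels, so nlev is always one less than the number of AP/BP entries.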

gcpy.get_grid_extents(data, edges=True)

Get min and max lat and lon from an input GEOS-Chem xarray dataset or grid dict

Args:

data: xarray Dataset or dict: A GEOS-Chem dataset or a grid dict
edges (optional): bool: Whether grid extents should use cell edges instead of centers. Default value: True

Returns:

minlon: float: Minimum longitude of data grid
maxlon: float: Maximum longitude of data grid
minlat: float: Minimum latitude of data grid
maxlat: float: Maximum latitude of data grid
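
The difference between edges=True and edges=False can be illustrated with plain lists of cell centers. This sketch is not GCPy's implementation: extents_from_centers is a hypothetical helper, it assumes a regularly spaced grid with at least two centers per axis, and it clamps latitude edges to the poles.

```python
def extents_from_centers(lons, lats, edges=True):
    """Hypothetical helper: grid extents from cell-center coordinates.

    Returns (minlon, maxlon, minlat, maxlat). With edges=True the
    extents are pushed outward by half a cell on each side.
    """
    minlon, maxlon = min(lons), max(lons)
    minlat, maxlat = min(lats), max(lats)
    if edges:
        # Assumes uniform spacing, so the first two centers give the cell size.
        half_dlon = abs(lons[1] - lons[0]) / 2.0
        half_dlat = abs(lats[1] - lats[0]) / 2.0
        minlon -= half_dlon
        maxlon += half_dlon
        # Latitude edges cannot extend past the poles.
        minlat = max(minlat - half_dlat, -90.0)
        maxlat = min(maxlat + half_dlat, 90.0)
    return minlon, maxlon, minlat, maxlat
```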

gcpy.make_regridder_S2S(csres_in, csres_out, sf_in=1, tlon_in=170, tlat_in=-90, sf_out=1, tlon_out=170, tlat_out=-90, weightsdir='.', verbose=True)

Create an xESMF regridder from a cubed-sphere / stretched-grid grid to another cubed-sphere / stretched-grid grid. Stretched-grid params of 1, 170, -90 indicate no stretching.

Args:

csres_in: int: Cubed-sphere resolution of input grid
csres_out: int: Cubed-sphere resolution of output grid

Keyword Args (optional):

sf_in: float: Stretched-grid factor of input grid. Default value: 1
tlon_in: float: Target longitude for stretching in input grid. Default value: 170
tlat_in: float: Target latitude for stretching in input grid. Default value: -90
sf_out: float: Stretched-grid factor of output grid. Default value: 1
tlon_out: float: Target longitude for stretching in output grid. Default value: 170
tlat_out: float: Target latitude for stretching in output grid. Default value: -90
weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
verbose: bool: Set this flag to True to enable printing when output faces do not intersect input faces when regridding. Default value: True

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids
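
Because the values 1, 170, -90 mean "no stretching", a caller may want to check whether a given parameter set actually describes a stretched grid. The helper below is a minimal sketch and is not part of GCPy; is_stretched and NO_STRETCH are hypothetical names.

```python
import math

# Stretch factor, target longitude, target latitude that mean "no stretching".
NO_STRETCH = (1.0, 170.0, -90.0)


def is_stretched(stretch_factor=1.0, target_lon=170.0, target_lat=-90.0):
    """Hypothetical helper: True if the parameters describe a stretched grid."""
    params = (float(stretch_factor), float(target_lon), float(target_lat))
    # Any departure from the documented defaults counts as stretching.
    return any(not math.isclose(value, default)
               for value, default in zip(params, NO_STRETCH))
```

For example, is_stretched(2.0, -84.0, 33.0) is True, while calling it with the defaults returns False.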

gcpy.reformat_dims(ds, format, towards_common)

Reformat dimensions of a cubed-sphere / stretched-grid grid between different GCHP formats

Args:

ds: xarray Dataset: Dataset to be reformatted
format: str: Format from or to which to reformat (‘checkpoint’ or ‘diagnostic’)
towards_common: bool: Set this flag to True to move towards a common dimension format

Returns:

ds: xarray Dataset: Original dataset with reformatted dimensions

gcpy.make_regridder_L2S(llres_in, csres_out, weightsdir='.', reuse_weights=True, sg_params=[1, 170, -90])

Create an xESMF regridder from a lat/lon to a cubed-sphere grid

Args:

llres_in: str: Resolution of input grid in format ‘latxlon’, e.g. ‘4x5’
csres_out: int: Cubed-sphere resolution of output grid

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
sg_params: list[float, float, float]: Output grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Non-default values trigger stretched-grid creation. Default value: [1, 170, -90] (no stretching)

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids

gcpy.make_regridder_C2L(csres_in, llres_out, weightsdir='.', reuse_weights=True, sg_params=[1, 170, -90])

Create an xESMF regridder from a cubed-sphere to lat/lon grid

Args:

csres_in: int: Cubed-sphere resolution of input grid
llres_out: str: Resolution of output grid in format ‘latxlon’, e.g. ‘4x5’

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: True
sg_params: list[float, float, float]: Input grid stretched-grid parameters in the format [stretch_factor, target_longitude, target_latitude]. Non-default values trigger stretched-grid creation. Default value: [1, 170, -90] (no stretching)

Returns:

regridder_list: list[6 xESMF regridders]: list of regridder objects (one per cubed-sphere face) between the two specified grids

gcpy.make_regridder_L2L(llres_in, llres_out, weightsdir='.', reuse_weights=False, in_extent=[-180, 180, -90, 90], out_extent=[-180, 180, -90, 90])

Create an xESMF regridder between two lat/lon grids

Args:

llres_in: str: Resolution of input grid in format ‘latxlon’, e.g. ‘4x5’
llres_out: str: Resolution of output grid in format ‘latxlon’, e.g. ‘4x5’

Keyword Args (optional):

weightsdir: str: Directory in which to create xESMF regridder NetCDF files. Default value: ‘.’
reuse_weights: bool: Set this flag to True to reuse existing xESMF regridder NetCDF files. Default value: False
in_extent: list[float, float, float, float]: Minimum and maximum longitude and latitude of the input grid in the format [minlon, maxlon, minlat, maxlat]. Default value: [-180, 180, -90, 90]
out_extent: list[float, float, float, float]: Desired minimum and maximum longitude and latitude of the output grid in the format [minlon, maxlon, minlat, maxlat]. Default value: [-180, 180, -90, 90]

Returns:

regridder: xESMF regridder: regridder object between the two specified grids

gcpy.reshape_MAPL_CS(da)

Reshapes data if its dimensions indicate MAPL v1.0.0+ output.

Args:

da: xarray DataArray: Data array variable

Returns:

data: xarray DataArray: Data with dimensions renamed and transposed to match old MAPL format

gcpy.file_regrid(fin, fout, dim_format_in, dim_format_out, cs_res_out=0, ll_res_out='0x0', sg_params_in=[1.0, 170.0, -90.0], sg_params_out=[1.0, 170.0, -90.0], vert_params_out=[[], []])

Regrids an input file to a new horizontal grid specification and saves it as a new file.

Args:

fin: str: The input filename
fout: str: The output filename (file will be overwritten if it already exists)
dim_format_in: str: Format of the input file’s dimensions (choose from: classic, checkpoint, diagnostic), where classic denotes lat/lon and checkpoint / diagnostic are cubed-sphere formats
dim_format_out: str: Format of the output file’s dimensions (choose from: classic, checkpoint, diagnostic), where classic denotes lat/lon and checkpoint / diagnostic are cubed-sphere formats

Keyword Args (optional):

cs_res_out: int: The cubed-sphere resolution of the output dataset. Not used if dim_format_out is classic. Default value: 0
ll_res_out: str: The lat/lon resolution of the output dataset. Not used if dim_format_out is not classic. Default value: ‘0x0’
sg_params_in: list[float, float, float]: Input grid stretching parameters [stretch-factor, target longitude, target latitude]. Not used if dim_format_in is classic. Default value: [1.0, 170.0, -90.0] (no stretching)
sg_params_out: list[float, float, float]: Output grid stretching parameters [stretch-factor, target longitude, target latitude]. Not used if dim_format_out is classic. Default value: [1.0, 170.0, -90.0] (no stretching)
vert_params_out: list(list, list) of list-like types: Hybrid grid parameter A in hPa and B (unitless) in [AP, BP] format. Needed for lat/lon output if not using the full 72-level or 47-level grid. Default value: [[], []]
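
Which output keyword arguments matter depends on dim_format_out. The sketch below restates that rule as code so the dependency is easy to see. It is not GCPy's implementation; relevant_output_options and VALID_FORMATS are hypothetical names, and the grouping follows the argument descriptions above.

```python
# Dimension formats named in the argument descriptions.
VALID_FORMATS = ("classic", "checkpoint", "diagnostic")


def relevant_output_options(dim_format_out):
    """Hypothetical helper: list the output keyword arguments that are used.

    'classic' denotes lat/lon output; 'checkpoint' and 'diagnostic'
    denote cubed-sphere output.
    """
    if dim_format_out not in VALID_FORMATS:
        raise ValueError(
            f"dim_format_out must be one of {VALID_FORMATS}, "
            f"got {dim_format_out!r}"
        )
    if dim_format_out == "classic":
        # Lat/lon output uses the lat/lon resolution and vertical parameters.
        return ["ll_res_out", "vert_params_out"]
    # Cubed-sphere output uses the cubed-sphere resolution and stretching.
    return ["cs_res_out", "sg_params_out"]
```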

gcpy.rename_restart_variables(ds, towards_gchp=True)

Renames restart variables according to GEOS-Chem Classic and GCHP conventions.

Args:

ds: xarray.Dataset: The input dataset

Keyword Args (optional):

towards_gchp: bool: Whether renaming to (True) or from (False) GCHP format. Default value: True

Returns:

xarray.Dataset: Input dataset with variables renamed

gcpy.drop_and_rename_classic_vars(ds, towards_gchp=True)

Renames and drops certain restart variables according to GEOS-Chem Classic and GCHP conventions.

Args:

ds: xarray.Dataset: The input dataset

Keyword Args (optional):

towards_gchp: bool: Whether going to (True) or from (False) GCHP format. Default value: True

Returns:

xarray.Dataset: Input dataset with variables renamed and dropped

gcpy.parser

gcpy.rotate_vectors(x, y, z, k, theta)

gcpy.cartesian_to_spherical(x, y, z)

gcpy.spherical_to_cartesian(x, y)

gcpy.schmidt_transform(x, y, s)

gcpy.scs_transform(x, y, s, tx, ty)
