Dataset Properties

the plugin properties in this section are applied at the dataset level use these properties to set up the structure of datasets and to define the relationship between source rows and target rows description used to provide a human readable description of the mapping logic applied to map the source dataset/target datasets multi list element source dataset the main mapping element for the source to target dataset mapping you can specify multiple source dataset elements each element corresponds to one source dataset being processed to use multiple datasets as the source for a certain domain, a new mapping element must be created for each source source dataset identifier specifies the name of the source dataset used in the conversion if the source dataset is a raw source dataset file (such as, a dataset from an edc system in csv or xpt format) then this property must be the dataset name the dataset name is the name of the file if the source is an intermediary domain defined elsewhere in the asset group then provide the oid for the domain ensure that the case for this property is identical to the source dataset to use this functionality in the source dataset identifier users need to use the format “assetgroupname”\[]”sourcedataset” for example sdtm+\[]sp ds pivot type this allows you to pivot the data as well as filter out any blank rows that occur specify the type of pivot that will be used on the source dataset if not specified and pivot variable is provided, defaults to (h v) if not specified and pivot variable is not specified, defaults to (h h) pivot variable identifies the value list variable which is used as the pivot in a horizontal to vertical (h v) mapping if not specified, defaults to horizontal to horizontal (h h) mapping use to initiate the horizontal to vertical pivoting functionality this will be a choice list of all the variables within the domain by specifying a variable the etl tool will look for any valuelist items attached to this variable and create a row for each valuelist item in the output target dataset if the pivot variable is unspecified, the converter will default the output to have as many input rows as output rows the only exception to this is when the include rows custom element is specified output blank pivot rows used to allow blank rows in the pivot operation this defaults to ‘false’ to avoid blank rows but can be set to ‘true’ include rows this element is used to specify conditions under which certain rows in the source datasets will be used in the final target dataset condition multi element to allow as many conditions as the user requires to include the desired rows on output can join multiple conditions by using the logic operator property one level lower dataset join this element lets you join additional datasets to the source dataset the dataset join will keep records in the dataset after the join has been implemented lower levels of the elements specify how datasets can be joined together each join requires at least one source column to match a column in the join dataset below are some examples of how dataset join works it is a left join meaning all rows and columns on the left will be accessible and for any matches in columns between the join condition on the left and right datasets, those rows and all columns in the right dataset will become accessible these columns are accessible meaning users have the ability to make mappings at the variable level to get values from these columns in the examples when there is a “ “ this means blank it is not null value, it is treated as an empty string used to specify the dataset the user wishes to join onto the original source dataset in the example above this would be dsic join condition used to specify the variables in each respective datasets to initiate the join this is a multi element which the user can specify many variables to satisfy the join condition users need to specify a variable from the source dataset and then a variable from the join dataset users can specify multiple join conditions for a domain in the example below the dov dataset is being joined onto a domain there is already one condition to join when the subject columns are equal users can add a join by selecting the create element button on the top left of the dialog window this will create an extra join condition element now users can specify another condition as to when the source datasets should join together in this example the join is extended to include when the folder columns are equal to each other description used to provide a human readable summary of what the join conditions are comment mapping element used to specify the mapping which will store the comments collected for that domain this will be used for the special dataset type mapping for the comments (co) domain the comment element contains a basic mapping element as detailed below mapping element for the mapping element users are restricted to a subset of the basic mapping source types the most commonly used mappings are as follows none this will create a blank cell in the row source variable specify the name of the input column that you want to map to string set a hardocoded string value function allows you to set a function for this variable, for example set the input column to uppercase choice this allows you to set a when condition for this mapping see, docid\ lkoy jem8accm5liupzyq for more details special dataset type used to identify special case domains this is a multi choice list with the options of comments, supplemental qualifiers, and subject visits comments this is a special purpose mapping for the comments (co) domain the coseq must reflect the co records sequence idvar or idvarval must reflect the sequence number of the other target domains with comments attached supplemental qualifiers this is a special mapping for the supplemental qualifier for any domain (supp ) it automatically maps the variables given the specific domain is a supplemental qualifier this mapping automatically specifies qval as the pivot variable, and sets idvar as the –seq variable of the parent domain users are able to overwrite any of the individual mappings as required by simply setting the mappings on the individual elements subject visits this is a special purpose mapping for the subject visit (sv) domain all variables map as normal except 'svstdtc' (earliest date for a given visit) and 'svendtc' (latest date for a given visit) for every subject visit, sv records are produced and processed to find every unique visit records are searched to find the first and last dates for each visit this will default to use all the output datasets as a source sort output rows used to configure the sorting of output rows prior to the –seq being applied sort key use to specify the variables to sort the domain this is a multi element option to sort on multiple variables variable use to specify the variable that is to be sorted note the variable option provides a list of all the variables in the define file only variables in the current domain should be selected sort order used to specify the sort order of the variable in ascending or descending order de duplicate removes duplicates on the input columns include rows can be used to filter the input rows by specifying a variable to filter out dataset join used to join this dataset with another input dataset sql left join operation where the first dataset specified is primary dataset and the all other datasets are joined to the primary dataset columns are joined depending on the keys specified vertical to horizontal (v h) pivot example domain level mappings the following example data is to be pivoted from the vertical format to a horizontal format to initiate the vertical to horizontal pivot, set the following mappings at the source dataset level source dataset identifier type1 pivot type v h note the pivot variable property is only used for h v pivots once the pivot type has been set, provide the vertical pivot information this is specified under vertical pivot properties, in this example, we set the following vertical pivot value the source column which the pivot values will be read from in this example, the value is “testcd” values the values element contains the source columns where values are read from the values specified can be referenced in the source mappings in this example, the value is “result” keys the keys property lets you specify a set of source variables as keys these are used to identify which records in the source are to become one record in the target if the keys are not unique enough, then it can cause blank values on output you can specify multiple values and keys in this example, we leave this property blank variable level mappings once the mappings at the source dataset level are configured correctly, you can set the vertical pivot mappings at the variable level you must specify the mapping source type “pivot value” for each of the columns in the target dataset that will use the vertically pivoted values add a mapping group set source type to “pivot value” under pivot value set the following sources result value bpsys value number this can be used to specify the nth unique value in the pivot this is used with the value property to determine which values to extrapolate from the source source this is used to specify the source value to input in this variable you can use this to pick explicit values from the source column this needs to be used in conjunction with the value property value this is used to determine which source column is going to provide the values for the target column this should match one of the value properties specified at the source level