Configuration
A primary goal of odk2stata is to be configurable, while using sensible defaults out of the box.
The configuration file can be created using:
$ python3 -m odk2stata.dofile.settings
For version 0.2.5, the settings file looks like this:
[DEFAULT]
skip = False
omit = False
dataset_source = briefcase
which_label = first_label
extra_label = o2s_label
[destring]
odk_names_to_destring =
[drop_column]
types_to_drop = note
odk_names_to_drop =
odk_names_not_to_drop =
[encode_select_one]
encode_select_ones = True
encode_external_select_ones = False
odk_names_to_encode =
odk_names_not_to_encode =
choice_lists_not_to_encode =
number_column = o2s_number
strict_numbering = False
label_replace_column = first_label
[label_variable]
first_paragraph_only = False
remove_numbering = True
stop_at_words =
stop_before_words =
[metadata]
author =
timestamp_format = %Y-%m-%d, %H:%M:%S
case_preserve = False
merge_single_repeat = True
merge_append = True
odk2stata_version = 0.2.5
[rename]
direct_rename =
rename_to_odk_name = True
[split_select_multiple]
default_split_method = append_number
binary_option_label = yes_no
binary_label = o2s_binary_label
choices_to_exclude = negative_numbers
choice_lists_to_split =
choice_lists_not_to_split =
odk_names_to_split =
odk_names_not_to_split =
odk_names_to_append_name =
odk_names_to_append_number =
odk_names_to_name_only =
number_column = o2s_number
strict_numbering = False
An .ini file is broken into sections, demarcated by [section_heading].
The odk2stata .ini settings file has roughly one section per area of work. Those sections are described below.
Note
The .ini file is parsed using Python’s configparser module. Therefore, the settings in [DEFAULT] are applied to all other sections as default key-value pairs.
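The fallback behaviour of [DEFAULT] can be seen in a minimal configparser sketch. The settings string is an abbreviated stand-in for the real file, and the `skip = True` override in [rename] is added purely for illustration:

```python
import configparser

# Abbreviated stand-in for the odk2stata settings file (not the full file)
SETTINGS = """
[DEFAULT]
skip = False
which_label = first_label

[destring]
odk_names_to_destring =

[rename]
skip = True
"""

parser = configparser.ConfigParser()
parser.read_string(SETTINGS)

# [destring] does not define skip, so the [DEFAULT] value is used
destring_skip = parser.getboolean("destring", "skip")
# [rename] overrides skip for that section only
rename_skip = parser.getboolean("rename", "skip")
# Keys from [DEFAULT] are visible from every section
rename_label = parser.get("rename", "which_label")

print(destring_skip, rename_skip, rename_label)
```

Setting a key inside a section therefore changes it for that section alone, while editing [DEFAULT] changes it everywhere it is not overridden.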
Note
odk2stata uses single strings and lists as values in the .ini file. Lists are made by putting one entry per line. The second and subsequent lines must be indented so that the parser does not mistake them for keys.
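As an illustration of the list syntax, the sketch below reads a multi-line value with configparser and splits it into a Python list. The ODK names (`consent_note`, `gps_warning`, `end_note`) and the `as_list` helper are hypothetical; odk2stata's own parsing may differ in detail.

```python
import configparser

# Hypothetical ODK names; continuation lines are indented
SETTINGS = """
[drop_column]
types_to_drop = note
odk_names_to_drop = consent_note
    gps_warning
    end_note
odk_names_not_to_drop =
"""

parser = configparser.ConfigParser()
parser.read_string(SETTINGS)


def as_list(raw):
    """Split a (possibly multi-line) value into a list, ignoring blank lines."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


names = as_list(parser.get("drop_column", "odk_names_to_drop"))
types = as_list(parser.get("drop_column", "types_to_drop"))
kept = as_list(parser.get("drop_column", "odk_names_not_to_drop"))

print(names)
print(types)
print(kept)
```

A single string is simply a one-item list under this scheme, and an empty value becomes an empty list.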
Section: DEFAULT
- skip = False
- Should a section be skipped? Default is False, not to skip.
- omit = False
- Should a section be omitted? Default is False, not to omit.
- dataset_source = briefcase
- Where does a dataset come from? Options are briefcase, aggregate, and no_groups.
- which_label = first_label
- Which label in survey or choices should be used? This value can be the exact name of a column to specify that one, or the special term first_label to mean the first label column on a sheet.
- extra_label = o2s_label
- Which column should be used as a label override? Default is o2s_label, meaning that a new column should be added to the XLSForm with a header o2s_label. If a value is found in this column, it is used preferentially over the column for which_label.
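The interplay between which_label and extra_label can be sketched as follows. This is not odk2stata's code: `pick_label` is a hypothetical helper, a row is modelled as a plain dict, and "first label column" is assumed to mean the first header that starts with `label`.

```python
def pick_label(row, headers, which_label="first_label", extra_label="o2s_label"):
    """Return the label for one XLSForm row, honoring the override column.

    row maps column header -> cell value; headers is the ordered list of
    column headers on the sheet (survey or choices).
    """
    # A non-empty value in the override column always wins
    override = row.get(extra_label, "")
    if override:
        return override
    if which_label == "first_label":
        # Assumption: label columns are named "label" or "label::<language>"
        column = next((h for h in headers if h.startswith("label")), None)
    else:
        column = which_label
    return row.get(column, "") if column else ""


headers = ["type", "name", "label::English", "label::French", "o2s_label"]
plain = {"name": "age", "label::English": "Age", "label::French": "Age (fr)", "o2s_label": ""}
overridden = dict(plain, o2s_label="Respondent age")

print(pick_label(plain, headers))
print(pick_label(plain, headers, which_label="label::French"))
print(pick_label(overridden, headers))
```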
Section: destring
This section handles the destringing portion of odk2stata.
Since output data files are CSV, all values are treated as strings on import and then destringed. odk2stata knows to destring columns based on the integer and decimal types.
- odk_names_to_destring =
- A list of names of additional columns to destring.
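A rough sketch of that selection, under the assumption that survey rows are available as dicts with `type` and `name` keys. `columns_to_destring` is a hypothetical helper, and the emitted line uses Stata's standard `destring varlist, replace` syntax rather than odk2stata's exact output.

```python
def columns_to_destring(survey_rows, odk_names_to_destring=()):
    """Pick integer/decimal questions plus any explicitly listed names."""
    numeric_types = {"integer", "decimal"}
    extra = set(odk_names_to_destring)
    return [
        row["name"]
        for row in survey_rows
        if row["type"] in numeric_types or row["name"] in extra
    ]


# Hypothetical survey rows
survey = [
    {"type": "text", "name": "hh_id"},
    {"type": "integer", "name": "age"},
    {"type": "decimal", "name": "weight"},
    {"type": "text", "name": "comment"},
]

selected = columns_to_destring(survey, odk_names_to_destring=["hh_id"])
command = "destring " + " ".join(selected) + ", replace"
print(command)
```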
Section: drop_column
- types_to_drop = note
- ODK types to drop automatically. Default is note. ODK datasets include columns for notes, so notes are dropped automatically.
- odk_names_to_drop =
- A list of ODK names (survey tab) to drop. Default is an empty list.
- odk_names_not_to_drop =
- A list of ODK names not to drop. Default is an empty list.
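The three settings combine roughly as in the sketch below. `columns_to_drop` is a hypothetical helper, and it assumes that odk_names_not_to_drop takes precedence over the other two settings; the source does not state the precedence explicitly.

```python
def columns_to_drop(survey_rows, types_to_drop=("note",),
                    odk_names_to_drop=(), odk_names_not_to_drop=()):
    """Select ODK names to drop by type or by name, minus protected names."""
    drop_types = set(types_to_drop)
    drop_names = set(odk_names_to_drop)
    protected = set(odk_names_not_to_drop)  # assumed to win over the rest
    return [
        row["name"]
        for row in survey_rows
        if row["name"] not in protected
        and (row["type"] in drop_types or row["name"] in drop_names)
    ]


# Hypothetical survey rows
survey = [
    {"type": "note", "name": "intro_note"},
    {"type": "note", "name": "consent_note"},
    {"type": "text", "name": "comment"},
    {"type": "integer", "name": "age"},
]

dropped = columns_to_drop(
    survey,
    odk_names_to_drop=["comment"],
    odk_names_not_to_drop=["consent_note"],
)
print(dropped)
```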
Section: encode_select_one
- encode_select_ones = True
- Should select_one question types be encoded at all? Default is True.
- encode_external_select_ones = False
- Should external select_one question types be encoded at all? Default is False.
- odk_names_to_encode =
- A list of additional ODK names to encode.
- odk_names_not_to_encode =
- A list of ODK names not to encode.
- choice_lists_not_to_encode =
- A list of list names on the choices tab that should not be encoded.
- number_column = o2s_number
- The column in the choices tab where the numbers to be used can be found. Default is o2s_number.
- strict_numbering = False
- Strict numbering means that if there are some choices without numbers in number_column, the program should error out. Default is False. If numbers are missing in the number_column then they are filled in starting with 1.
- label_replace_column = first_label
- Encoding labels should be replaced with entries in this column. Default is first_label or the first column with a label.
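The effect of strict_numbering can be sketched as below. This is one possible reading of "filled in starting with 1", namely that each choice without a number receives the smallest positive integer not already in use; odk2stata's actual fill rule may differ. `number_choices` and the sample choice list are hypothetical.

```python
def number_choices(choices, number_column="o2s_number", strict_numbering=False):
    """Return (choice name, number) pairs for one choice list."""

    def raw_number(choice):
        return str(choice.get(number_column, "")).strip()

    missing = [c["name"] for c in choices if not raw_number(c)]
    if missing and strict_numbering:
        raise ValueError(f"Choices without a number in {number_column}: {missing}")

    used = {int(raw_number(c)) for c in choices if raw_number(c)}
    numbered = []
    candidate = 1
    for choice in choices:
        if raw_number(choice):
            numbered.append((choice["name"], int(raw_number(choice))))
            continue
        # Assumed fill rule: next positive integer that is not already used
        while candidate in used:
            candidate += 1
        used.add(candidate)
        numbered.append((choice["name"], candidate))
    return numbered


yes_no_dk = [
    {"name": "yes", "o2s_number": "1"},
    {"name": "no", "o2s_number": ""},
    {"name": "dont_know", "o2s_number": "-88"},
]

print(number_choices(yes_no_dk))
```

With `strict_numbering=True` the same list raises a `ValueError` because `no` has no number.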
Section: label_variable
- first_paragraph_only = False
- Use the first paragraph only when labeling a variable.
- remove_numbering = True
- Remove question numbering at the beginning.
- stop_at_words =
- Use the label up to and including any of the words in this list.
- stop_before_words =
- Use the label up to but not including any of the words in this list.
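A sketch of how these four options might shape a variable label. `clean_label` is a hypothetical helper; it assumes a paragraph ends at the first line break and that question numbering looks like `101.`, `Q1a)` or `3:` at the start of the label. Neither detail is specified in the source.

```python
import re

# Assumed shape of question numbering: optional letters, digits, then . ) or :
NUMBERING = re.compile(r"^\s*\w*\d+\w*[.):]\s*")


def clean_label(label, first_paragraph_only=False, remove_numbering=True,
                stop_at_words=(), stop_before_words=()):
    """Apply the label_variable options to one question label."""
    text = label.strip()
    if first_paragraph_only:
        # Assumption: a paragraph ends at the first line break
        text = text.split("\n", 1)[0]
    if remove_numbering:
        text = NUMBERING.sub("", text, count=1)
    for word in stop_at_words:
        index = text.find(word)
        if index != -1:
            text = text[: index + len(word)]  # keep the word itself
    for word in stop_before_words:
        index = text.find(word)
        if index != -1:
            text = text[:index]  # cut before the word
    return text.strip()


print(clean_label("101. How old are you?\nProbe if needed.", first_paragraph_only=True))
print(clean_label("3. What is your age in years (completed)?", stop_at_words=["years"]))
print(clean_label("3. What is your age? Enter in years.", stop_before_words=["Enter"]))
```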
Section: metadata
- author =
- Who is the author of this Stata do file?
- timestamp_format = %Y-%m-%d, %H:%M:%S
- What is the timestamp format for dates and times?
- case_preserve = False
- Should case be preserved on the first Stata infile? Default is False so that case is not preserved.
- merge_single_repeat = True
- If there is a single repeat group, should it be merged into the dataset? If there are multiple repeat groups, then each repeat group has its own do file section, since they generate their own datasets.
- merge_append = True
- If merging takes place should the do file append the variables to the end? If so, set this to True. False inserts the variables where they occur in the XLSForm.
- odk2stata_version = 0.2.5
- What is the version of odk2stata? Default is generated from within the code.
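The timestamp_format value uses Python's strftime codes. The sketch below shows what the default produces for an arbitrary example moment. It also passes `interpolation=None` when reading the value directly, because configparser's default interpolation would otherwise reject the bare `%` characters; how odk2stata itself handles this is not covered here.

```python
import configparser
from datetime import datetime

parser = configparser.ConfigParser(interpolation=None)  # keep % codes literal
parser.read_string("[metadata]\ntimestamp_format = %Y-%m-%d, %H:%M:%S\n")
timestamp_format = parser.get("metadata", "timestamp_format")

# An arbitrary example moment
moment = datetime(2019, 3, 7, 14, 5, 9)
stamp = moment.strftime(timestamp_format)
print(stamp)  # 2019-03-07, 14:05:09
```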
Section: rename
- direct_rename =
- A list of direct renames. An example is var1 myVariable.
- rename_to_odk_name = True
- Should Stata variables be renamed to their ODK name? Default is True to do such renaming.
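Each direct_rename entry is a pair of names separated by whitespace. A minimal sketch of turning such entries into Stata `rename` commands follows; `rename_commands` is a hypothetical helper and the second entry is made up for illustration.

```python
def rename_commands(direct_rename):
    """Turn 'old new' entries into Stata rename commands."""
    commands = []
    for entry in direct_rename:
        parts = entry.split()
        if len(parts) != 2:
            raise ValueError(f"Expected 'old new', got: {entry!r}")
        old, new = parts
        commands.append(f"rename {old} {new}")
    return commands


entries = ["var1 myVariable", "hh_size householdSize"]
for line in rename_commands(entries):
    print(line)
```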
Section: split_select_multiple
- default_split_method = append_number
- How should the new variables be named? Default is append_number to take the original variable and append a number. Other options are append_name to take the original variable and append the choice name, and name_only to use the choice name only as the new variable.
- binary_option_label = yes_no
- What should the binary variables be labeled as? Default is yes_no for “Yes” and “No”.
- binary_label = o2s_binary_label
- What should the Stata label name be in the do file? Default is o2s_binary_label.
- choices_to_exclude = negative_numbers
- Which choices to exclude? Setting this to negative_numbers means that any choice name that is a negative number does not get a new variable.
- choice_lists_to_split =
- Which choice lists should be split? This is a list.
- choice_lists_not_to_split =
- Which choice lists should not be split? This is a list.
- odk_names_to_split =
- Which ODK names should be split? This is a list.
- odk_names_not_to_split =
- Which ODK names should not be split? This is a list.
- odk_names_to_append_name =
- Which ODK names should be split in the append_name style?
- odk_names_to_append_number =
- Which ODK names should be split in the append_number style?
- odk_names_to_name_only =
- Which ODK names should be split in the name_only style?
- number_column = o2s_number
- What column should be used to look for choice numbers? Default is o2s_number.
- strict_numbering = False
- Strict numbering means that if there are some choices without numbers in number_column, the program should error out. Default is False. If numbers are missing in the number_column then they are filled in starting with 1.
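The three split methods and the negative_numbers exclusion can be compared side by side in a short sketch. `split_variable_names` is a hypothetical helper, the underscore separator is an assumption (the source does not say how the parts are joined), and the `water` question with its choices is made up.

```python
def is_negative_number(text):
    """True if the choice name parses as a number below zero."""
    try:
        return float(text) < 0
    except ValueError:
        return False


def split_variable_names(odk_name, choices, method="append_number",
                         exclude_negative=True, sep="_"):
    """Name the binary variables created from one select_multiple question.

    choices is a list of (choice name, number) pairs; sep is an assumed joiner.
    """
    names = []
    for choice_name, number in choices:
        if exclude_negative and is_negative_number(choice_name):
            continue  # choices_to_exclude = negative_numbers
        if method == "append_number":
            names.append(f"{odk_name}{sep}{number}")
        elif method == "append_name":
            names.append(f"{odk_name}{sep}{choice_name}")
        elif method == "name_only":
            names.append(choice_name)
        else:
            raise ValueError(f"Unknown split method: {method}")
    return names


water_choices = [("piped", 1), ("well", 2), ("-99", 3)]
for method in ("append_number", "append_name", "name_only"):
    print(method, split_variable_names("water", water_choices, method=method))
```

The `-99` choice produces no variable in any of the three styles because its name is a negative number.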