FAQs

Data Preview Portal Questions

How can the variables list be searched or filtered?

To obtain all the variables contained in a questionnaire, type the two- or three-letter prefix (e.g. SDC for Socio-demographic Variables) into the full-text search box in the Variables listing, under “Variable properties > Name”. You can also use more general terms such as ‘food’, ‘work’, etc. (under “Variable properties > Label”) to find variables related to those terms; however, a search on general terms may not return every relevant variable. For more information on the variables included in a questionnaire, please visit the Data Collection Tools under the Researchers tab.
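The same prefix and keyword logic can be reproduced offline once you have a list of variable names and labels, for example one copied from a data dictionary. The following is a minimal Python sketch only. The variable names and labels are invented for illustration (only the SDC prefix comes from the text above), the underscore after the prefix is an assumption, and the portal's own search may behave differently.

```python
# Hypothetical (name, label) pairs; not taken from the actual CLSA dataset.
VARIABLES = [
    ("SDC_EXAMPLE_A", "Example socio-demographic item"),
    ("SDC_EXAMPLE_B", "Current work status (illustrative)"),
    ("XYZ_EXAMPLE_C", "Food preparation (illustrative)"),
]


def search_by_prefix(variables, prefix):
    """Return names that start with the questionnaire prefix, e.g. 'SDC'."""
    # Assumption: the prefix is separated from the rest of the name by "_".
    prefix = prefix.upper() + "_"
    return [name for name, _ in variables if name.upper().startswith(prefix)]


def search_by_label(variables, term):
    """Return names whose label contains the search term, ignoring case."""
    term = term.lower()
    return [name for name, label in variables if term in label.lower()]


sdc_names = search_by_prefix(VARIABLES, "SDC")
food_names = search_by_label(VARIABLES, "food")
```

A label search only finds variables whose label happens to contain the term, which is one reason keyword results may be incomplete.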

What variables are included in each questionnaire?

How are multiple choice questions represented as variables in the CLSA dataset?

Multiple-choice questions are represented by either a single variable or multiple variables, depending on what the question allows:

- A question allowing only one response is represented by a single variable that can take on multiple values. Open-text responses are permitted in many questions; common and distinct responses are recoded to create new categories within the variable itself.

- For a question allowing multiple responses, each possible response category is assigned its own binary variable. Open-text responses are also permitted in many of these questions; common and distinct options are also recoded to create additional variables within the question scope. The number of variables corresponding to that question matches the number of response options.
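The two representations described above can be sketched in Python as follows. This is an illustration under stated assumptions: the participant IDs, option names and the 0/1 coding are hypothetical and are not taken from the CLSA data dictionaries.

```python
# Hypothetical single-response question: one variable holding one code per participant.
single_response = {"P001": 2, "P002": 1}

# Hypothetical multiple-response question with three response options.
OPTIONS = ["OPT_A", "OPT_B", "OPT_C"]
selected = {"P001": {"OPT_A", "OPT_C"}, "P002": {"OPT_B"}}


def to_binary_variables(selections, options):
    """Expand each participant's selections into one 0/1 variable per option."""
    return {
        pid: {opt: int(opt in chosen) for opt in options}
        for pid, chosen in selections.items()
    }


binary = to_binary_variables(selected, OPTIONS)
```

Each participant ends up with as many binary variables as there are response options, whereas the single-response question stays as one variable.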

Where can I find information about response categories for multiple choice questions?

In the variable view, clicking on the name of a variable reveals more detailed information. For example, under Categories, the variable information page will include the following information:

• Name: the value entered for a response in the questionnaire;

• Label: the response (or response category) corresponding to each value (Name);

• Missing: values corresponding to a question that was not answered (don’t know, not applicable, or refused).

Where can I find supplementary information about each of the variables?

Clicking on the variable name in the variable view reveals more detailed information about that variable. (This function is not available for all study variables.) This information includes the question pertinent to the variable, the variable label, a list of the response option categories and some automatically generated summary statistics. In some instances, additional notes on skip patterns or references are included as well.

Data Access Questions

How do I get access to the data?

Researchers are invited to submit data access requests using Magnolia, the CLSA’s online data access application system. Requests to access the CLSA data are reviewed by the Data and Sample Access Committee (DSAC). Please consult the Data and Sample Access Policy and Guiding Principles, the pertinent sections of the CLSA protocol(s), the CLSA Data Collection Tools, and the information on the Data Access Application Process and deadlines, before preparing an application. The steps involved and the content required in the Magnolia data access application are outlined here.

When are the data access application deadlines?

There are multiple data access application deadlines per year. For upcoming dates, please see Application Deadlines under the Data Access section. If you are the recipient of a CIHR Catalyst Grant for the use of CLSA data, you do not need to submit to one of the application deadlines. Please contact access@clsa-elcv.ca for special instructions.

Do I need an institutional email address to access CLSA data?

Yes, anyone requiring access to CLSA data must use the email address of the institution with which they are affiliated. Data will only be released to institutional email addresses. Email addresses with domain names such as Gmail, Hotmail, etc. are not acceptable.

How are alphanumeric data released to users?

The Data and Sample Access Committee (DSAC) reviews applications for the use of CLSA data and biospecimens and makes a recommendation to the CLSA Scientific Management Team (SMT). Once the project has been approved by the SMT, the CLSA Access Agreement has been signed, and proof of ethics approval has been received by the CLSA, the Data Curation Centre (DCC) will release the dataset(s) to the Primary Applicant via a download link. The link is valid for seven days and the number of downloads is determined by the number of Project Team members who have signed Schedule F of the CLSA Access Agreement, indicating that they require direct access to the data. It is the Primary Applicant’s responsibility to share the download link with the Project Team members who have signed Schedule F, and to ensure that all of the Project Team members respect the terms of the signed CLSA Access Agreement. Please refer to the sample CLSA Access Agreement for more information on the responsibilities of users.

Can I share the data?

Strict CLSA security and confidentiality rules are in place governing the use and sharing of CLSA data. The CLSA requires users to sign a CLSA Access Agreement that details the specific uses of data and the CLSA’s expectations with regard to privacy and confidentiality. Only the Primary Applicant and the Project Team members who have signed Schedule F of the CLSA Access Agreement or an approved Amendment are allowed to have direct access to the raw data. No approved user or member of their research team is allowed to share the CLSA dataset, in whole or in part, with individuals who have not signed Schedule F of the CLSA Access Agreement or an approved Amendment.

Can I add an investigator or a student to the project team, so that they may have access to the data?

Yes, you can add personnel to your approved study while your CLSA Access Agreement is active. If your application was submitted in 2019 or later, you can create an Amendment request in Magnolia, using the Create Amendment button at the top right of the screen once you have selected the appropriate requisition.

For applications approved before 2019, please email access@clsa-elcv.ca, noting that you wish to amend your current project, and provide the project ID number. A member of the team will contact you and your project will be entered into our online application program, Magnolia. Once this is completed, you can create your amendment in Magnolia. Once your amendment has been approved and the CLSA Access Agreement has been amended (if required), you will be notified of the next steps.

What is the format of the alphanumeric dataset when released?

Data are provided to researchers in comma-separated values (.csv) files. Please note that the Baseline CLSA alphanumeric dataset alone contains over 4,000 variables collected from more than 51,000 participants. Depending on your choice of statistical software and proposed analyses, automatic data imports may not succeed and you may need to instruct your software how to read the file. This may require the use of advanced scripting and/or macros in some cases. The CLSA encourages you to include someone experienced in working with such complex datasets on your project team.
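As one illustration of instructing software how to read a file, the Python sketch below reads every cell as text and then converts only the columns that are named explicitly, instead of relying on automatic type detection. It is a sketch under stated assumptions: the column names and sample values are hypothetical, and the in-memory string stands in for a released .csv file.

```python
import csv
import io

# A tiny stand-in for a released .csv file; column names and values are hypothetical.
SAMPLE = "participant_id,AGE_EXAMPLE,NOTE_EXAMPLE\n001,65,\n002,,free text\n"


def to_int_or_none(value):
    """Convert a text cell to int, treating a blank cell as missing."""
    value = value.strip()
    return int(value) if value else None


def read_rows(handle):
    """Read every cell as text, then convert only the columns named explicitly."""
    rows = []
    for row in csv.DictReader(handle):
        row["AGE_EXAMPLE"] = to_int_or_none(row["AGE_EXAMPLE"])
        rows.append(row)
    return rows


rows = read_rows(io.StringIO(SAMPLE))
```

With a file on disk, the same function could be called on a handle opened with `open(path, newline="")`. Keeping identifiers as text preserves leading zeros, and blank cells stay distinguishable from a numeric zero.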

How large are the datasets?

The Baseline alphanumeric data files are released in a zipped folder of approximately 85 MB and the Follow-up 1 alphanumeric data files in a folder of about 60 MB. For the size of images and raw data, please consult the CLSA Data Availability Table.

How long do I have to use and analyse the data once I receive them?

Researchers who have received data will have a specified time within which the proposed analyses must be completed. This timeframe is defined in the CLSA Access Agreement. The timeline of the approved project is defined in the application upon submission in Magnolia. The duration of the CLSA Access Agreement, which is clearly indicated in the agreement, typically extends two years beyond the project end date to allow for publication of results. If the analyses are not completed in this period, the applicant must submit an Amendment request for a time extension or their data access agreement may be terminated (CLSA Access Agreement, Section 13). The CLSA will monitor the approved applications for adherence to the timeline.

How can I request biospecimens?

Biospecimens are not currently available. For questions related to biospecimens, please contact the Biorepository and Bioanalysis Centre (BBC) at bbc@clsa-elcv.ca.

Can data access be expedited?

Data access cannot be expedited; all interested researchers are required to follow the same application procedures to gain access to the CLSA dataset. Please see the Data Access Application Process for further information.

Where can I find information on projects using the CLSA dataset?

Approved Project Summaries are posted under the Researchers section of our website.

Can I submit an application for data access with objectives that are similar to an already approved project?

The CLSA encourages researchers to collaborate on similar research objectives. Summaries of approved projects are posted on the Approved Projects page and it is up to the applicants to determine if a similar application has already been approved. Please note that all applications submitted to the CLSA are reviewed and considered independently, even if similar research objectives are being proposed by different researchers.

Where can I find publications about and using the CLSA dataset?

Publications about the CLSA, as well as those using CLSA data, can be found under Publications, in the Stay Informed section of our website.

Publications & Presentation Questions

Does the CLSA need to review my publications and presentations?

Final drafts of all manuscripts, reports, reviews, pre-prints and other proposed primary publications describing research using CLSA data and/or biospecimens must be sent to the CLSA, by the Primary Applicant, for review at least 15 working days prior to the anticipated submission. Abstracts, Posters and Presentations do not need to be submitted for review but should include appropriate Acknowledgements. Please review our Publication and Promotion Policy for CLSA Data Users for additional information.

Please submit all publications for review to access@clsa-elcv.ca.

Can I publish multiple peer-reviewed manuscripts based on my approved CLSA project?

As a publicly funded research platform, the CLSA encourages the dissemination of research findings from approved projects. The CLSA expects users to publish their findings in peer-reviewed journals. Multiple publications may be prepared based on a single approved project as long as the publications are directly linked to the objectives of the approved project.

How do I reference a questionnaire or module from the CLSA, if I use it in my own research?

All questionnaires and modules made available on our website must be referenced as appropriate. Please refer to the Publication and Promotion Policy for CLSA Data Users for the sources and associated Conditions of Use of the questionnaires used in the CLSA, and to section 2.7 of the Policy to know how to cite a questionnaire that has been modified by the CLSA.

Where do I find CLSA’s data availability statement?

Please consult Section 2.5 of our Publication and Promotion Policy for CLSA Data Users for our data availability statement.

Dataset Questions

How many participants are part of the CLSA at Baseline?

At Baseline, 21,241 participants were enrolled in the Tracking cohort and 30,097 participants in the Comprehensive cohort for a total of 51,338 CLSA participants.

The 30-minute Maintaining Contact Questionnaire (MCQ) interviews with additional health-related questions were completed approximately 18 months after the initial Baseline data collection. 19,052 Tracking participants and 28,789 Comprehensive participants completed the MCQ. The indicator variable ADM_COMPLETE_MCQ is included in the dataset to indicate those participants who completed the MCQ.
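The figures above can be cross-checked with a few lines of Python. The counts are those quoted in the text; the completion percentages are derived here for illustration and are not official CLSA statistics.

```python
# Counts quoted in the text above.
baseline = {"Tracking": 21_241, "Comprehensive": 30_097}
mcq_completed = {"Tracking": 19_052, "Comprehensive": 28_789}

total_baseline = sum(baseline.values())  # all Baseline participants
total_mcq = sum(mcq_completed.values())  # all participants who completed the MCQ

# Share of each Baseline cohort that completed the MCQ, as a percentage.
mcq_rate = {
    cohort: round(100 * mcq_completed[cohort] / baseline[cohort], 1)
    for cohort in baseline
}
```

The Baseline total reproduces the 51,338 stated above, and the two MCQ counts sum to 47,841.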

What were the Baseline exclusion criteria?

Please refer to Section 5.3 of the CLSA Protocol, available under the Researchers section of our website.

When were Baseline data collected?

Periods of data collection for the Baseline assessments were as follows:

Baseline Tracking: 2011-09 to 2014-05

Baseline Comprehensive: 2011-12 to 2015-07

Maintaining Contact Questionnaire (MCQ) Tracking: 2013-09 to 2016-02

Maintaining Contact Questionnaire (MCQ) Comprehensive: 2014-05 to 2016-01

When were Follow-Up 1 data collected?

Periods of data collection for Follow-Up 1 (FUP1) assessments were as follows:

FUP1 Tracking: 2014-05 to 2018-12

FUP1 Comprehensive: 2015-07 to 2018-12

Does the CLSA provide guidance on how to analyse my data?

No, it is not within the purview of the CLSA to advise approved users on statistical analyses for approved projects. Data Support Documentation is available under the Researchers tab of our website, including a detailed document on the use of Sampling Weights. For further help, please consult with a statistician.

Are bootstrap weights available for the analyses of CLSA data?

No, the CLSA does not have bootstrap weights for the dataset, and we are not planning to produce bootstrap weights in the near future.

Will the CLSA dataset be linked to provincial health administrative databases across Canada? When will these data be available?

CLSA is working centrally on strategies to link individual level CLSA data with data from health administrative databases across Canada. In 2021, the Health Data Research Network Canada (HDRN Canada) and the Canadian Longitudinal Study on Aging (CLSA) announced a new partnership to enable linkage of the CLSA cohort data with data held at provincial data centres. Please continue to monitor the website for updates.

Can I link the CLSA dataset to other third party data holdings that I have access to?

Linking of the CLSA data to third party data holdings by an approved user is prohibited. Any proposals for linkage must be approved by the CLSA Scientific Management Team, and executed internally by the CLSA. Six-digit postal codes or HIN data are never released to users.

Are CLSA data available in Research Data Centres (RDC)?

No, CLSA data are only available through a direct application to the CLSA. For more information on how to apply, please consult the Data Access Application Process section of our website.

What do blank values for a variable represent?

In general, variables in the CLSA dataset reflect the interview process. In some cases, follow-up questions were only asked if specific answers were given to preceding questions.

Blank values in the Baseline data can represent multiple types of missing data, including:

1) Valid skip patterns. For example, the number of daughters and the number of sons are asked only if the participant answered that they have at least one child. In the CLSA dataset, participants with no children will have blank values for both.

2) Missing data due to non-completion. Some participants skipped entire sections of the Baseline interview, and therefore have blanks for all the questions in those sections. Indicator variables such as ADM_COMPLETE_MCQ are provided in the documentation accompanying the data release and should be consulted when there is a large amount of missing data, to determine whether it is due to a participant not completing a section.

In the Follow-up 1 dataset, missing data have been assigned various codes according to the reason the data are missing. Details of the different types of missing data are provided in the data dictionaries accompanying datasets.
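For the Baseline case described above, the distinction between a valid skip and non-completion can be sketched in Python as follows. This is an illustration only: the function, its arguments and the returned labels are hypothetical, and the actual gate questions and completion indicators must be taken from the data dictionaries.

```python
def classify_blank(value, gate_answer, section_completed):
    """Label why a Baseline cell is blank, using a gate question and a completion flag.

    value: the raw cell as text ("" means blank)
    gate_answer: answer to the preceding question that controls the skip,
        e.g. False for "has no children" (hypothetical coding)
    section_completed: hypothetical completion indicator for the section
    """
    if value != "":
        return "observed"
    if not section_completed:
        return "section not completed"
    if gate_answer is False:
        return "valid skip"
    return "unexplained blank"


# A participant with no children who completed the section:
reason = classify_blank("", gate_answer=False, section_completed=True)
```

Checking the completion indicator before the gate question means that a blank in an uncompleted section is not mistaken for a valid skip.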

What are derived variables?

Within the CLSA dataset, derived variables (DVs) are variables that are created from other variables. DVs are derived by re-grouping or re-classifying the original variables, to glean information otherwise not available. Some DVs are based on published measures or scales. You will find documentation related to DVs on our Data Support Documentation page, under the Researchers tab of our website.
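As a simple illustration of re-grouping, the Python sketch below derives a categorical variable from a numeric one. The source variable (age in years) and the band boundaries are hypothetical and do not correspond to any specific CLSA derived variable.

```python
def age_group(age):
    """Re-group a hypothetical age-in-years variable into illustrative bands."""
    if age is None:
        return None  # keep missing values missing rather than guessing a band
    if age < 55:
        return "under 55"
    if age < 65:
        return "55-64"
    if age < 75:
        return "65-74"
    return "75 or older"


ages = [47, 58, None, 81]
derived = [age_group(a) for a in ages]
```

Missing inputs are passed through as missing, so the derived variable does not assign a band to participants with no recorded age.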

How is participant death captured in the CLSA?

Participant death is currently captured in three ways: 1) from the next of kin contacting the CLSA directly, 2) through the ‘maintaining contact’ telephone calls that occur between main waves of data collection, or 3) from linkage to provincial vital statistics. Mortality data, including cause of death, are not yet available. However, vital statistics that provide information on the vital status of participants up to July 2019 are released along with the Follow-up 1 dataset.

What if there appears to be an error or omission in the data that I receive?

The CLSA takes great care to check the accuracy and completeness of the data prior to release. However, because of the size of the dataset and the large number of variables, we cannot guarantee the accuracy, completeness, or fitness for any particular purpose of the data. It is the responsibility of each data user to verify their dataset, the accompanying data dictionaries and the Data Support Documentation available on our website. If you think your data are incomplete or if you identify errors while conducting your analyses, please contact us at access@clsa-elcv.ca.

How are CLSA datasets updated?

The CLSA updates its datasets on a regular basis with additional data, corrections and other updates. Such changes are always indicated by the version number of a dataset. When an update is ready, we send a Data Release Update email to all approved Primary Applicants, explaining the change(s). You will be able to request the updated dataset if you are approved for those data and you decide that the updates are relevant to your project.

Application Questions

Who can apply to use CLSA data?

The CLSA data are currently available to approved public sector researchers, with no preferential or exclusive access for any individual. The CLSA welcomes applications from graduate students and postdoctoral or clinical fellows who wish to use the CLSA data for their thesis or fellowship research, respectively. For trainee applications (Masters, PhD, postdoctoral or clinical fellows), the primary applicant must be the supervisor and hold an eligible appointment (continuing or term appointment) at an eligible institution that is able to uphold the conditions of the data access agreement, administer grant funds and provide Research Ethics Board approval.

As an international researcher, can I apply to access CLSA data?

Yes, investigators affiliated with public research organizations outside Canada can apply to access alphanumeric data collected as part of the CLSA.

How can I determine if my research project is feasible?

The CLSA DataPreview Portal has been designed to help researchers browse available variables in the CLSA dataset and find basic frequencies. Should you need additional information not available through the Portal to determine the feasibility of your proposal, please contact us at access@clsa-elcv.ca. We can provide simple cross-tabulations (of two or three variables) from cross-sectional data. While we do our utmost to respond to data queries to help potential users ensure that their proposal is feasible, please note that we are not resourced to provide statistics on change across time-points or more in-depth statistical support.

Will my proposal undergo a scientific review?

Evidence of peer-reviewed funding will be considered evidence of scientific review for data access applications. You must provide proof of the peer-reviewed funding for the specific project in your application. If there are no plans to submit an application for financial support for your project, or if it is a trainee project, please provide evidence of peer review (e.g., internal Departmental review or thesis protocol defense) if available. Awards of funding not specific to the proposed project (e.g., student fellowships) are not considered proof of peer review. If no evidence of scientific peer review is provided with the application, the project will undergo scientific review by the Data and Sample Access Committee.

Does my project need to have secured funding before I apply?

No, you do not need to secure funding before applying to request CLSA data. If funding has been requested, but not yet approved, please provide the name of the funding agency in the appropriate section of the application. If your project is approved, and once the CLSA Access Agreement has been signed, the CLSA Financial Administrator will contact the Primary Applicant with the invoice for payment of the fees.

Do I have to obtain ethics approval for my project?

Yes. Please note that ethics approval is not required at the time of the application to use CLSA data, but no data or biospecimens will be released until proof of ethics approval has been received by the CLSA. Should your institution not require a full ethical review for the use of de-identified data, please provide a letter from your Institutional Review Board to this effect. Ethics approval must be obtained only from the Primary Applicant’s institution, not from all of the institutions of the members of the Project Team.

What are the fees for access to CLSA data?

Currently the charge for partial cost recovery per application is $3,000 CAD for applications within Canada and $5,000 USD for international applications. This includes access to the Baseline, Follow-up 1 or both datasets. Due to the additional work required to make some data research ready, additional fees apply for access to image files, raw data and datasets that require more complex customization. Please see the table of additional fees on the Fees page of our website. The fees are payable for the retrieval and preparation of the dataset per approved project, not for each Project Team member.

Do trainees have to pay for access to CLSA data?

Trainees enrolled at Canadian institutions for their graduate degree, postdoc, or clinical fellowship can apply for a fee waiver if they are graduate students (Masters or PhD) who wish to obtain the CLSA data for the sole purpose of their thesis, or postdoctoral or clinical fellows (limit 1 waiver per fellowship) who wish to obtain the CLSA data for the sole purpose of their postdoctoral or clinical fellowship project. Canadian trainees working outside Canada but funded through a Canadian source are also eligible for a fee waiver. The request for a fee waiver must be checked in Part 1 of the CLSA Data and Biospecimen Access Request Application. Applications made through the CIHR Catalyst Grant for the Analysis of CLSA Data are not eligible for trainee fee waivers. Fee waivers do not include images and raw data.

If I have a student or trainee as part of my project team, am I eligible to get a fee waiver for data access?

No, simply having trainees as part of the Project Team does not satisfy criteria for eligibility for a fee waiver.

When will data from Follow-up 2 be available?

We anticipate the first release of Follow-up 2 (FUP2) alphanumeric data in 2022. Applications to access data from FUP2 are not accepted until those data are research ready. Please check the CLSA Data Availability Table for current data availability that may be requested as part of a data access application.

How long will it take to receive my dataset?

Once you submit your CLSA Data and Biospecimen Request Application, you can track the progress of your application through the online application system. Once you have begun the application process, it is your responsibility to check your email (including your folders for junk, spam, etc.) for notifications from Magnolia. You may be contacted by the Data Access Team if additional information is required. You will be notified about the approval status of your application approximately three months after the submission deadline.

If your application is approved, a CLSA Access Agreement must be negotiated and signed between McMaster University and your institution. This part of the process can take a variable length of time (up to an additional three (3) months) and is not under the control of the CLSA. You will also need to provide evidence of ethics approval for your project, if you have not done so within your initial application. Please be aware that these steps may affect the length of time that it takes for the data to be released to you.

Once all parties have signed the CLSA Access Agreement and proof of ethics approval has been received by the CLSA, your data will be released within 10 working days. Please note that the release of additional data (images and raw data) will require more time (one to three months, depending on the request). When planning for your project, you must include in your timeframe at least six (6) months from the application submission deadline to the time you receive your dataset, if your application is approved.

Can my application be rejected?

The goal of the CLSA is to enable data access as far as possible; however, an application for data access can be rejected. There may be some instances when an applicant is asked to revise and resubmit an application at the recommendation of the Data and Sample Access Committee (DSAC). To avoid applications being sent to the DSAC that are not appropriate, we try to work with interested researchers before the application is submitted, to provide them with information about the available data and feasibility of a project. During the application process itself, we ask applicants to correct errors and omissions and provide feedback from the DSAC review, so that applicants can clarify or revise and resubmit a proposal. Failure to provide the level of detail sufficient to assess the study feasibility could lead to the application not being approved.

What do I do if I have problems completing the online application?

Should you encounter any issues completing the online application, please contact us via access@clsa-elcv.ca.

Canadian Longitudinal Study on Aging