SAS Interview Q&A: Clinical trials
1. Describe the phases of clinical trials?
Ans:- Clinical trials have the following four phases:
Phase 1: A new drug or treatment is tested in a small group of people (20-80) to evaluate its safety.
Phase 2: The experimental drug or treatment is given to a larger group of people (100-300) to see whether it is effective for the intended treatment.
Phase 3: The experimental drug or treatment is given to large groups of people (1000-3000) to confirm its effectiveness, monitor side effects and compare it to commonly used treatments.
Phase 4: Phase 4 studies are post-marketing studies that gather further information on the drug's risks, benefits etc.
2. Describe the validation procedure? How would you perform the validation for TLGs as well as analysis data sets?
Ans:- Validation is used to check the output of the SAS program written by the source programmer. In this process the validator independently writes a program and generates the output. If this output is the same as the output generated by the source programmer, the program is considered valid. For TLGs the validation can be done by checking the output manually, and for analysis data sets it can be done using PROC COMPARE.
3. How would you perform the validation for a listing which has 400 pages?
Ans:- It is not practical to validate a 400-page listing manually. Instead, we get the content of the listing into data sets, for example by having the production program save the data set behind the report (PROC REPORT has an OUT= option) while the validator independently programs the same data set, and then we compare the two by using PROC COMPARE.
4. Can you use PROC COMPARE to validate listings? Why?
Ans:- Yes, we can use PROC COMPARE to validate listings. When a listing has many entries (pages) it is not possible to check them manually, so we compare the data set behind the listing with an independently programmed one by using PROC COMPARE.
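A minimal sketch of such a comparison is given here. The data sets PROD and QC and their variables are made up for illustration; in a real study they would be the production and validation copies of the data behind the listing.

```sas
/* Hypothetical production and QC copies of the same listing data */
data prod;
  input subjid $ trt_grp $ age;
  datalines;
101 A 34
102 B 51
103 A 47
;
run;

data qc;
  input subjid $ trt_grp $ age;
  datalines;
101 A 34
102 B 52
103 A 47
;
run;

proc sort data=prod; by subjid; run;
proc sort data=qc;   by subjid; run;

/* LISTALL also reports observations and variables found in only one data set */
proc compare base=prod compare=qc listall;
  id subjid;
run;

/* SYSINFO is 0 only when PROC COMPARE found no differences */
%put NOTE: PROC COMPARE return code = &sysinfo;
```

The ID statement matches the records by subject instead of by position, which helps when one side has extra or missing records.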
5. How would you generate tables, listings and graphs?
Ans:- We can generate listings by using PROC REPORT. Similarly, we can create tables by using PROC FREQ, PROC MEANS, PROC TRANSPOSE and PROC REPORT. We would generate graphs using PROC GPLOT etc.
6. How many tables can you create in a day?
Ans:- It depends on the complexity of the tables; if they are of the same type, we can create 4-5 tables in a day.
7. Which PROCs have you used in your experience?
Ans:- I have used many procedures like PROC REPORT, PROC SORT, PROC FORMAT etc. I have used PROC REPORT to generate list reports; in this procedure I used subjid as an order variable and trt_grp, sbd, dbd as display variables.
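A listing of that kind could look roughly like the sketch below. The data set VS and its values are invented for illustration; only the variable names come from the answer above.

```sas
/* Hypothetical input data; values are made up */
data vs;
  input subjid $ trt_grp $ sbd dbd;
  datalines;
101 A 120 80
101 A 118 78
102 B 135 90
;
run;

proc report data=vs nowd;
  column subjid trt_grp sbd dbd;
  define subjid  / order   "Subject";    /* ORDER sorts the rows and suppresses repeated values */
  define trt_grp / display "Treatment";
  define sbd     / display "SBD";
  define dbd     / display "DBD";
run;
```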
8. Describe the data sets you have come across in your life?
Ans:- I have worked with demographic, adverse event, laboratory, analysis and other data sets.
9. How would you submit the docs to FDA? Who will submit the docs?
Ans:- We can submit the docs to FDA by e-submission. The electronic records must comply with 21 CFR Part 11. The submission includes documentation about the macros and programs as well as the e-records. The statistician or project manager will submit the docs to FDA.
10. What docs do you submit to FDA?
Ans:- We submit ISS and ISE documents to FDA.
11. Can you share your CDISC experience? What version of CDISC have you used?
Ans:- I didn't get a chance to work with CDISC extensively, but I have helped my project manager and statistician with CDISC work. I have used version 3 of CDISC.
12. Tell me the importance of the SAP?
Ans:- This document contains detailed information regarding study objectives and statistical methods to aid in the production of the Clinical Study Report (CSR) including summary tables, figures, and subject data listings for Protocol. This document also contains documentation of the program variables and algorithms that will be used to generate summary statistics and statistical analysis.
13. Tell me about your project group? To whom would you report/contact?
Ans:- My project group consists of six members: a project manager, two statisticians, a lead programmer and two programmers. I would report to the lead programmer. If I have any problem regarding the programming I would contact the lead programmer. If I have any doubt about the values of variables in a raw dataset I would contact the statistician. For example, in a dataset related to menopause symptoms in women, if the variable sex has both the values F and M, I would consider it wrong; in that type of situation I would contact the statistician.
14. Explain SAS documentation?
Ans:- SAS documentation includes the program header, comments, titles, footnotes etc. Whatever we type in the program to make it easily readable and understandable is called SAS documentation.
15. How would you know whether the program has been modified or not?
Ans:- I would know whether the program has been modified by looking at the modification history in the program header.
16. Project status meeting?
Ans:- It is a plenary meeting of all the project managers to discuss the present status of the project in hand and to discuss new ideas and options for improving the way it is presently being performed.
17. Describe the Clintrial database and Oracle Clinical?
Ans:- Clintrial is one of the market's leading Clinical Data Management Systems (CDMS). Oracle Clinical (OC) is a database management system designed by Oracle to provide data management, data entry and data validation functionality for the clinical trials process.
18. Tell me about MedDRA and what version of MedDRA did you use in your project?
Ans:- Medical Dictionary for Regulatory Activities. Version 10.
19. Describe SDTM?
Ans:- CDISC's Study Data Tabulation Model (SDTM) has been developed to standardize what is submitted to the FDA.
20. What is CRT?
Ans:- Case Report Tabulation
21. What is annotated CRF?
Ans:- A case report form (CRF) is the collection of forms on which the data of each patient in the trial are recorded. An annotated CRF is a CRF on which the corresponding dataset variable names are written next to the fields.
22. What do you know about 21 CFR Part 11?
Ans:-Title 21 CFR Part 11 of the Code of Federal Regulations deals with the FDA guidelines on electronic records and electronic signatures in the United States. Part 11, as it is commonly called, defines the criteria under which electronic records and electronic signatures are considered to be trustworthy, reliable and equivalent to paper records.
23. Have you done validation in your projects?
Ans:- I validated fellow programmers' work to ensure that the logic and intent of the program is correct and that data errors are detected.
E.g., when validating a macro that adds titles:
Verify error and warning messages are generated when the macro is called more than 10 times, which means adding more than 10 titles.
Verify the error message when the TITLENUM parameter is invalid.
Verify a warning message is generated if the total length of the texts specified in the input parameters LEFT, CENTER, and RIGHT is greater than 132 characters. Also verify precedence is given to the string in input parameter LEFT if the total string length is more than 132 characters.
Verify there is no error/warning message generated if the macro is used within a data step and all input parameters are valid.
24. What are the contents of AE dataset? What is its purpose? What are the variables in adverse event datasets?
Ans:-The adverse event data set contains the SUBJID, body system of the event, the preferred term for the event, event severity. The purpose of the AE dataset is to give a summary of the adverse event for all the patients in the treatment arms to aid in the inferential safety analysis of the drug.
25. What are the contents of lab data? What is the purpose of data set?
Ans:-The lab data set contains the SUBJID, week number, and category of lab test, standard units, low normal and high range of the values. The purpose of the lab data set is to obtain the difference in the values of key variables after the administration of drug.
26. Tell me about this company in India? How big is it? Why are they using SAS?
Ans:- Central Drugs Standard Control Organization (CDSCO) trial categories:
Human/Clinical pharmacology trials (Phase I)
Exploratory trials (Phase II)
Confirmatory trials (Phase III)
About ACT: turnover of around $30 million. Headquarters in India.
27. Have you created CRTs? If you have, tell me what you have done.
Ans:- Yes, I have created patient profile tabulations at the request of my manager and the statistician. I have used PROC REPORT and PROC SQL to create simple patient listings which had all the information on a particular patient including age, sex, race etc.
28. Have you created transport files?
Ans:- Yes, I have created SAS XPORT transport files using PROC COPY and the DATA step for FDA submissions. These are version 5 files. We use the XPORT libname engine and PROC COPY, with one dataset in each transport file. Version 5 constraints: labels no longer than 40 bytes, variable names no longer than 8 bytes, character variables no wider than 200 bytes. If we violate these constraints the copy procedure may terminate with errors, because the XPORT format complies with the limits of SAS version 5 datasets.
libname sdtm "c:\sdtm_data";
libname dm xport "c:\dm.xpt";
/* IN= and OUT= are options on the PROC COPY statement itself */
proc copy in=sdtm out=dm;
  select dm;
run;
29. How did you do data cleaning? How do you change the values in the data on your own?
Ans:-I used proc freq and proc univariate to find the discrepancies in the data, which I reported to my manager.
30. Definitions?
Ans:- CDISC: Clinical Data Interchange Standards Consortium.
It has different data models, which define clinical data standards for the pharmaceutical industry.
SDTM: Study Data Tabulation Model. It defines the data tabulation datasets that are to be sent to the FDA for regulatory submissions (CRTs).
ADaM: Analysis Data Model. Defines guidance for creating analysis data sets.
ODM: Operational Data Model. XML-based data model that allows transfer of clinical data.
Define.xml: the machine-readable version of the data definition file (define.pdf).
ICH E3: Guideline, Structure and Content of Clinical Study Reports
ICH E6: Guideline, Good Clinical Practice
ICH E9: Guideline, Statistical Principles for Clinical Trials
21 CFR 312.32: IND (Investigational New Drug Application) safety reporting
31. Have you ever done any edit check programs in your project? If you have, tell me what you know about edit check programs.
Ans:- Yes, I have done edit check programs.
Edit check programs: data validation.
1. Data validation: PROC MEANS, PROC UNIVARIATE, PROC FREQ.
Data cleaning: finding errors.
2. Checking for invalid character values.
Proc freq data = patients;
Tables gender dx ae / nocum nopercent;
Run;
This gives frequency counts of the unique character values.
3. PROC PRINT with a WHERE statement to list invalid data values.
[systolic blood pressure: 80 to 200]
[diastolic blood pressure: 60 to 120]
4. PROC MEANS, PROC UNIVARIATE and PROC TABULATE to look for outliers.
PROC MEANS: min, max, n and mean.
PROC UNIVARIATE: five highest and lowest values
[stem-and-leaf plots and box plots]
5. PROC FORMAT: range checking
6. Data analysis: SET, MERGE, UPDATE, KEEP, DROP in the DATA step.
7. Create datasets: PROC IMPORT and DATA step from flat files.
8. Extract data: LIBNAME.
9. SAS/STAT: PROC ANOVA, PROC REG.
10. Duplicate data: PROC SORT with NODUPKEY or NODUPRECS (alias NODUP)
NODUPKEY: only checks for duplicates in the BY variables
NODUPRECS: compares the entire observation (all variables) with the previous one
To get the duplicate observations, first sort BY the key with NODUPKEY, then merge the result back with the original dataset and keep the records that appear in the original but were dropped by the sort.
11. For creating analysis datasets from the raw data sets I used PROC FORMAT and the RENAME and LENGTH statements to make changes and finally create an analysis data set.
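For the duplicate check in point 10, one simpler alternative to the merge-back approach is the DUPOUT= option of PROC SORT (SAS 9 and later), which writes the dropped records to their own data set. This is only a sketch; the data set VISITS and its variables are hypothetical.

```sas
/* Hypothetical visit records; patient 002 has a repeated visit */
data visits;
  input patno $ visit;
  datalines;
001 1
001 2
002 1
002 1
003 1
;
run;

/* Keep the first record per PATNO/VISIT in UNIQUE; send the extra copies to DUPS */
proc sort data=visits out=unique dupout=dups nodupkey;
  by patno visit;
run;

proc print data=dups;
  title "Duplicate PATNO/VISIT records";
run;
```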
32. What is Verification?
Ans:- The purpose of verification is to ensure the accuracy of the final tables and the quality of the SAS programs that generated them. Following the SOP and the SAP, I selected a subset of the final summary tables for verification, e.g. the adverse event table and the baseline and demographic characteristics table.
The verification results were checked against the original final tables, and all discrepancies, if any, were documented.
33. What is ANNOTATED CRF?
Ans:-An annotated CRF is a CRF in which the variable names are written next to the spaces provided for the investigator. It serves as a link between the database/data sets and the questions on the CRF.
34. What is Program Validation?
Ans:- It is the same as macro validation except that here we validate programs. According to the SOP, I first had to determine what the program is supposed to do, check whether it works as it is supposed to, and create a validation document stating whether the program works properly, setting the status as pass or fail. Pass the input parameters to the program and check the log for errors.
35. What do you know about ISS and ISE? Have you ever produced these reports?
Ans:-ISS (Integrated summary of safety):
Integrates safety information from all sources (animal, clinical pharmacology, controlled and uncontrolled studies, epidemiologic data). "ISS is, in part, simply a summation of data from individual studies and, in part, a new analysis that goes beyond what can be done with individual studies."
ISE (Integrated Summary of efficacy)
ISS & ISE are critical components of the safety and effectiveness submission and expected to be submitted in the application in accordance with regulation. FDA’s guidance Format and Content of Clinical and Statistical Sections of Application gives advice on how to construct these summaries. Note that, despite the name, these are integrated analyses of all relevant data, not summaries.
36. Explain the process and how to do Data Validation?
Ans:- I have done data validation and data cleaning to check whether the data values are correct and conform to the standard set of rules.
A very simple approach to identifying invalid character values in a file is to use PROC FREQ to list all the unique values of the variables. This gives us the total number of invalid observations. After identifying the invalid data, we have to locate the observations so that we can report the particular patient numbers to the manager.
Invalid data can be located using DATA _NULL_ programming. Following is an example:
DATA _NULL_;
   INFILE "C:\PATIENTS.TXT" PAD;
   FILE PRINT; ***SEND OUTPUT TO THE OUTPUT WINDOW;
   TITLE "LISTING OF INVALID DATA";
   ***NOTE: WE WILL ONLY INPUT THOSE VARIABLES OF INTEREST;
   INPUT @1  PATNO  $3.
         @4  GENDER $1.
         @24 DX     $3.
         @27 AE     $1.;
   ***CHECK GENDER;
   IF GENDER NOT IN ('F','M',' ') THEN PUT PATNO= GENDER=;
   ***CHECK DX;
   IF VERIFY(DX,' 0123456789') NE 0 THEN PUT PATNO= DX=;
   ***CHECK AE;
   IF AE NOT IN ('0','1',' ') THEN PUT PATNO= AE=;
RUN;
For data validation of numeric values, such as out-of-range values, I used PROC PRINT with a WHERE statement.
PROC PRINT DATA=CLEAN.PATIENTS;
   WHERE (HR  NOT BETWEEN 40 AND 100 AND HR  IS NOT MISSING) OR
         (SBP NOT BETWEEN 80 AND 200 AND SBP IS NOT MISSING) OR
         (DBP NOT BETWEEN 60 AND 120 AND DBP IS NOT MISSING);
   TITLE "OUT-OF-RANGE VALUES FOR NUMERIC VARIABLES";
   ID PATNO;
   VAR HR SBP DBP;
RUN;
If we have a range of numeric codes stored as character values, such as '001' - '999', we can first define a user-defined format and then use PROC FREQ to determine the invalid values.
PROC FORMAT;
VALUE $GENDER 'F','M' = 'VALID'
' ' = 'MISSING'
OTHER = 'MISCODED';
VALUE $DX '001' - '999'= 'VALID'
' ' = 'MISSING'
OTHER = 'MISCODED';
VALUE $AE '0','1' = 'VALID'
' ' = 'MISSING'
OTHER = 'MISCODED';
RUN;
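To show how such a format is then used, here is a small self-contained sketch: it defines the $GENDER format again, creates a made-up PATIENTS data set, and lets PROC FREQ count the formatted categories. The data values are assumptions for illustration only.

```sas
proc format;
  value $gender 'F','M' = 'VALID'
                ' '     = 'MISSING'
                other   = 'MISCODED';
run;

/* Hypothetical patients; 003 and 005 are miscoded, 004 is blank */
data patients;
  input patno $ 1-3 gender $ 5;
  datalines;
001 M
002 F
003 X
004
005 f
;
run;

/* PROC FREQ groups by the formatted value; MISSING keeps the blank category in the table */
proc freq data=patients;
  tables gender / nocum nopercent missing;
  format gender $gender.;
run;
```

Note that a character range such as '001' - '999' is compared as text, so a value like '1A2' would also fall inside it; a VERIFY check as in the DATA _NULL_ example is stricter.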
One of the simplest ways to check for invalid numeric values is to run either PROC MEANS or PROC UNIVARIATE.
We can use the N and NMISS options in PROC MEANS to check for missing and invalid data. The default statistics are N, MEAN, STD, MIN and MAX, so NMISS has to be requested explicitly.
The main advantage of PROC UNIVARIATE (which reports N, mean, standard deviation, skewness, kurtosis and more by default) is that we get the extreme values, i.e. the lowest and highest 5 values, which we can inspect for data errors. If you want to see the patient ID for these particular observations, add an ID PATNO statement to the UNIVARIATE procedure.
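Both procedures can be tried on a small made-up data set, as in the sketch below. The data set VITALS and its values are hypothetical; the variable names follow the PROC PRINT example above.

```sas
/* Hypothetical vital signs with one implausible HR and two missing values */
data vitals;
  input patno $ hr sbp dbp;
  datalines;
001 72 120 80
002 . 210 95
003 900 118 76
004 65 . 70
;
run;

/* NMISS is not a default statistic, so it is requested explicitly */
proc means data=vitals n nmiss min max mean maxdec=1;
  var hr sbp dbp;
run;

/* The ID statement labels the extreme observations with the patient number */
proc univariate data=vitals;
  id patno;
  var hr sbp dbp;
run;
```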
37. Roles and responsibilities?
Ans:-Programmer: Develop programming for report formats (ISS & ISE shell) required by the regulatory authorities.
Update ISS/ISE shell, when required.
Clinical Study Team:
Provide information on safety and efficacy findings, when required.
Provide updates on safety and efficacy findings for periodic reporting.
Study Statistician
Draft ISS and ISE shell.
Update shell, when appropriate.
Analyze and report data in approved format, to meet periodic reporting requirements.
38. Explain the types of clinical trial studies you have come across?
Single Blind Study
When the patients are not aware of which treatment they receive
Double Blind Study
When the patients and the investigator are unaware of the treatment group assigned
Triple Blind Study
Triple blind study is when patients, investigator, and the project team are unaware of the treatments administered.
39. What are the domains/datasets you have used in your studies?
Demog
Adverse Events
Vitals
ECG
Labs
Medical History
PhysicalExam
40. Can you list the variables in all the domains?
Ans:-
Demog: Patient Id, Age, Sex, Race, Screening Weight, Screening Height, BMI
Adverse Events: Protocol no, Investigator no, Patient Id, Preferred Term, Investigator Term (abdominal distress, frequent urination, headache, dizziness, hand-foot syndrome, rash, leukopenia, neutropenia), Severity, Seriousness (y/n), Seriousness Type (death, life threatening, permanently disabling), Visit number, Start time, Stop time, Related to study drug?
Vitals: Subject number, Study date, Procedure time, Sitting blood pressure, Sitting Cardiac Rate, Visit number, Change from baseline, Dose of treatment at time of vital sign, Abnormal (yes/no), BMI, Systolic blood pressure, Diastolic blood pressure.
ECG: Subject no, Study Date, Study Time, Visit no, PR interval (msec), QRS duration (msec), QT interval (msec), QTc interval (msec), Ventricular Rate (bpm), Change from baseline, Abnormal.
Labs: Subject no, Study day, Lab parameter (Lparm), lab units, ULN (upper limit of normal), LLN (lower limit of normal), visit number, change from baseline, Greater than ULN (yes/no), lab related serious adverse event (yes/no).
Medical History: Medical Condition, Date of Diagnosis (yes/no), Years of onset or occurrence, Past condition (yes/no), Current condition (yes/no).
PhysicalExam: Subject no, Exam date, Exam time, Visit number, Reason for exam, Body system, Abnormal (yes/no), Findings, Change from baseline (improvement, worsening, no change), Comments
41. Give me examples of the edit checks you made in your programs?
Examples of Edit Checks
Demog:
Weight is outside expected range
Body mass index is below expected ( check weight and height)
Age is not within expected range
DOB is greater than the Visit date
Invalid Gender value
Adverse
Stop is before the start or visit
Start is before birthdate
Study medicine discontinued due to adverse event but completion indicated (COMPLETE =1)
Labs
Result is within the normal range but abnormal is not blank or ‘N’
Result is outside the normal range but abnormal is blank
Vitals
Diastolic BP > Systolic BP
Medical History
Visit date prior to Screen date
Physical
Physical exam is normal but comment included
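Two of the checks listed above (AE stop before start, and diastolic BP greater than systolic BP) could be programmed roughly as follows. This is only a sketch: the data sets AE and VITALS, their variable names and values are all hypothetical.

```sas
/* Hypothetical adverse event dates; patient 002 stops before starting */
data ae;
  input patno $ aestart :date9. aestop :date9.;
  format aestart aestop date9.;
  datalines;
001 01JAN2008 05JAN2008
002 10FEB2008 03FEB2008
003 15MAR2008 .
;
run;

/* Hypothetical vital signs; patient 002 has DBP above SBP */
data vitals;
  input patno $ sbp dbp;
  datalines;
001 120 80
002 85 110
003 . 70
;
run;

/* N() counts non-missing arguments, so incomplete records are not flagged */
data ae_issues;
  set ae;
  length check $40;
  if n(aestart, aestop) = 2 and aestop < aestart then do;
    check = "AE stop date is before start date";
    output;
  end;
run;

data vs_issues;
  set vitals;
  length check $40;
  if n(sbp, dbp) = 2 and dbp > sbp then do;
    check = "Diastolic BP > systolic BP";
    output;
  end;
run;

proc print data=ae_issues; run;
proc print data=vs_issues; run;
```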
42. What are the advantages of using SAS in clinical data management? Why should we not use other software products in managing clinical data?
ADVANTAGES OF USING A SAS®-BASED SYSTEM
Ans:- Less hardware is required. A typical SAS®-based system can utilize a standard file server to store its databases and does not require one or more dedicated servers to handle the application load. PC SAS® can easily be used to handle processing, while data access is left to the file server. Additionally, it is possible to use the SAS® product SAS/SHARE® to provide a dedicated server to handle data transactions.
Fewer personnel are required. Systems that use complicated database software often require the hiring of one or more DBAs (Database Administrators) who make sure the database software is running, make changes to the structure of the database, etc. These individuals often require special training or background experience in the particular database application being used, typically Oracle. Additionally, consultants are often required to set up the system and/or studies since dedicated servers and specific expertise requirements often complicate the process.
Users with even casual SAS® experience can set up studies. Novice programmers can build the structure of the database and design screens. Organizations that are involved in data management almost always have at least one SAS® programmer already on staff. SAS® programmers will have an understanding of how the system actually works which would allow them to extend the functionality of the system by directly accessing SAS® data from outside of the system.
Setup time is dramatically reduced. By keeping studies on a local file server and making the database and screen design processes extremely simple and intuitive, setup time is reduced from weeks to days.
All phases of the data management process become homogeneous. From entry to analysis, data reside in SAS® data sets, often the end goal of every data management group. Additionally, SAS® users are involved in each step, instead of having specialists from different areas hand off pieces of studies during the project life cycle.
No data conversion is required. Since the data reside in SAS® data sets natively, no conversion programs need to be written.
Data review can happen during the data entry process, on the master database. As long as records are marked as being double-keyed, data review personnel can run edit check programs and build queries on some patients while others are still being entered.
Tables and listings can be generated on live data. This helps speed up the development of table and listing programs and allows programmers to avoid having to make continual copies or extracts of the data during testing.
43. Have you ever had to follow SOPs or programming guidelines?
Ans:- An SOP describes the process that assures that standard coding activities, which produce tables, listings and graphs, functions and/or edit checks, are conducted in accordance with industry standards and are appropriately documented.
It is normally used whenever new programs are required or existing programs require some modification during the set-up, conduct and/or reporting of clinical trial data.
44. Describe the types of SAS programming tasks that you performed: Tables? Listings? Graphics? Ad hoc reports? Other?
Ans:- Prepared programs required for the ISS and ISE analysis reports. Developed and validated programs for preparing ad hoc statistical reports for the clinical study report. Wrote analysis programs in line with the specifications defined by the study statistician. Base SAS procedures (MEANS, FREQ, SUMMARY, TABULATE, REPORT, UNIVARIATE etc.) and SAS/STAT procedures (REG, GLM, ANOVA etc.) were used for summarization, cross-tabulations and statistical analysis. Created statistical reports using PROC REPORT, DATA _NULL_ and SAS macros. Created, derived, merged and pooled datasets, listings and summary tables for Phase I and Phase II clinical trials.
45. Have you been involved in editing the data or writing data queries?
Ans:- If your interviewer asks this question, then you should ask him what he means by editing the data and by data queries.
46. Are you involved in writing the inferential analysis plan? Table’s specifications?
47. What do you feel about hardcoding?
Ans:- Programmers sometimes hardcode when they need to produce a report urgently. But it is always better to avoid hardcoding, as it overrides the database controls in clinical data management. Data often change in a trial over time, and a hardcode that is written today may not be valid in the future. Unfortunately, a hardcode may be forgotten and left in the SAS program, and that can lead to an incorrect database change.
48. How do you write a test plan?
Ans:- Before writing a test plan you have to look into the functional specifications. The functional specifications themselves depend on the requirements, so one should have a clear understanding of both the requirements and the functional specifications to write a test plan.
49. What is the difference between verification and validation?
Ans:-Although the verification and validation are close in meaning, "verification" has more of a sense of testing the truth or accuracy of a statement by examining evidence or conducting experiments, while "validate" has more of a sense of declaring a statement to be true and marking it with an indication of official sanction.
50. What other SAS features do you use for error trapping and data validation?
Ans:-Conditional statements, if then else.
Put statement
Debug option.
51. What is PROC CDISC?
Ans:- It is a SAS procedure that is available as a hotfix for SAS 8.2 and comes as a part of SAS 9.1.3. PROC CDISC allows us to import and export XML files that are compliant with the CDISC ODM version 1.2 schema. For more details refer to the textbook SAS Programming in the Pharmaceutical Industry.
52. What is LOCF?
Ans:- Pharmaceutical companies conduct longitudinal studies on human subjects that often span several months. It is unrealistic to expect patients to keep every scheduled visit over such a long period of time. Despite every effort, patient data are not collected for some time points, and these become missing values in the SAS data set. For reporting purposes, the most recent previously available value is substituted for each missing visit. This is called Last Observation Carried Forward (LOCF).
LOCF doesn't mean last SAS dataset observation carried forward. It means last non-missing value carried forward. It is the values of individual measures that are the "observations" in this case. And if you have multiple variables containing these values then they will be carried forward independently.
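One common way to program LOCF is with a RETAIN statement and BY-group processing. The sketch below assumes a hypothetical data set LABS with one record per subject and visit; the names and values are made up.

```sas
/* Hypothetical lab results with some missed visits */
data labs;
  input subjid $ visit result;
  datalines;
101 1 5.2
101 2 .
101 3 4.8
101 4 .
102 1 .
102 2 6.1
102 3 .
;
run;

proc sort data=labs; by subjid visit; run;

data labs_locf;
  set labs;
  by subjid;
  retain last_val;
  if first.subjid then last_val = .;              /* never carry a value across subjects */
  if not missing(result) then last_val = result;  /* remember the latest non-missing value */
  locf = last_val;
  drop last_val;
run;
```

A visit with no earlier non-missing value (subject 102, visit 1 here) stays missing, because there is nothing to carry forward. With several measures, each one would need its own retained variable so that they are carried forward independently.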
SAS Interview Questions: General
1. Why is a STOP statement needed for the POINT= option on a SET statement?
Ans:- Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially. Without a STOP statement the DATA step would loop continuously.
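A small sketch of direct access with POINT= is given below; the data set BIG is created only for illustration.

```sas
/* Hypothetical data set with 100 observations */
data big;
  do id = 1 to 100;
    output;
  end;
run;

/* Read every 10th observation directly by observation number */
data every10th;
  do i = 10 to nobs by 10;
    set big point=i nobs=nobs;   /* NOBS= is filled at compile time */
    output;
  end;
  stop;   /* no end-of-file is ever reached, so the step must be ended explicitly */
run;
```

The POINT= variable (I here) is temporary and is not written to the output data set.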
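2. How do you control the number of observations and/or variables read or written?
Ans:- The FIRSTOBS= and OBS= options control the observations; the KEEP= and DROP= options control the variables.
3. Approximately what date is represented by the SAS date value of 730?
Ans:- 31st December 1961, since SAS dates count days from 1st January 1960 (day 0).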
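This can be checked quickly in a DATA _NULL_ step by printing the value with a date format:

```sas
data _null_;
  d = 730;
  put d= date9.;            /* writes d=31DEC1961 to the log */
  start = '01JAN1960'd;
  put start=;               /* writes start=0 */
run;
```

1960 was a leap year (366 days) and 1961 had 365 days, so day 730 is the last day of 1961.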
4. Identify statements whose placement in the DATA step is critical.
Ans:- INPUT, DATA and RUN...
5. Does SAS 'Translate' (compile) or does it 'Interpret'? Explain.
Ans:- Compile. Each DATA step is compiled first and then executed.
6. What does the RUN statement do?
Ans:- The RUN statement marks the end of a DATA or PROC step and tells SAS to execute it. A following DATA or PROC statement also ends the previous step, so if a PROC step follows the DATA step you can omit the RUN statement for all but the last step.
7. Why is SAS considered self-documenting?
Ans:- SAS is considered self-documenting because when it creates a data set it stores descriptive information inside it, such as the creation date and time, the number of variables and their labels; you can look at that information using PROC CONTENTS.
8. What are some good SAS programming practices for processing very large data sets?
Ans:- Sort them only once; use FIRSTOBS= and OBS= to work on a subset while testing.
9. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
Ans:- Functions are used inside the DATA step and calculate across variables within the same observation; PROCs calculate across observations and can create new data sets to hold the results.
10. If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?
Ans:- I would use PROC TRANSPOSE if there are few variables and arrays if there are many; it depends on the data.
11. What is a method for assigning first.VAR and last.VAR to the BY group variable on unsorted data?
Ans:- FIRST. and LAST. variables require a BY statement, which normally needs sorted or indexed data. If the data are grouped but not in sorted order, the NOTSORTED option on the BY statement can be used.
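When the records are already grouped but the groups are not in sorted order, the NOTSORTED option on the BY statement still gives FIRST. and LAST. flags. A minimal sketch with made-up data:

```sas
/* Hypothetical data: grouped by SITE, but the sites are not in alphabetical order */
data visits;
  input site $ patno $;
  datalines;
B 201
B 202
A 101
C 301
C 302
;
run;

data flagged;
  set visits;
  by site notsorted;            /* a group is any run of consecutive equal SITE values */
  first_in_group = first.site;
  last_in_group  = last.site;
run;

proc print data=flagged; run;
```

If the same SITE value appeared again later in the file, it would be treated as a new group, so this only helps when the data really are grouped.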
12. How do you debug and test your SAS programs?
Ans:- The first thing is to look in the log for ERRORs, WARNINGs and, in some cases, NOTEs, or to use the DATA step debugger.
13. What other SAS features do you use for error trapping and data validation?
Ans:- Check the log; for data validation, use PROC FREQ, PROC MEANS or sometimes PROC PRINT to see what the data look like.
14. How would you combine 3 or more tables with different structures?
Ans:- I would sort them by their common variables and use a MERGE statement. I would also ask what is meant by different structures.
15. What areas of SAS are you most interested in?
Ans:- BASE, STAT, GRAPH, ETS
16. Briefly describe 5 ways to do a "table lookup" in SAS.
Ans:- Match merging, direct access, format tables, arrays, PROC SQL
17. What versions of SAS have you used (on which platforms)?
Ans:- SAS 8.2 on Windows and UNIX, SAS 7 and 6.12
18. What are some good SAS programming practices for processing very large data sets?
Ans:- Sampling using the OBS= option or subsetting, commenting the lines, using DATA _NULL_
19. What are some problems you might encounter in processing missing values? In Data steps? Arithmetic? Comparisons? Functions? Classifying data?
Ans:- Arithmetic operators return a missing value if any operand is missing, whereas functions such as SUM and MEAN ignore missing arguments. In comparisons a missing numeric value is smaller than any number. Most SAS statistical procedures exclude observations with any missing variable values from an analysis.
20. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
Ans:- Using PROC TRANSPOSE
21. What is the difference between functions and PROCs that calculate the same simple descriptive statistics?
Ans:- A PROC has a wider scope, works across observations and can send its results to a different dataset. Functions work on the values within the current observation of the DATA step.
22. If you were told to create many records from one record, show how you would do this using arrays and with PROC TRANSPOSE?
Ans:- With an array: declare an array over the variables in the record and use a DO loop with an OUTPUT statement. With PROC TRANSPOSE: list the variables in the VAR statement.
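Both approaches can be sketched on a small wide data set. The data set WIDE and the variables WK1-WK3 are invented for illustration.

```sas
/* Hypothetical wide data: one record per subject, one variable per week */
data wide;
  input subjid $ wk1 wk2 wk3;
  datalines;
101 5.1 5.4 5.0
102 6.2 . 6.0
;
run;

/* Array approach: one OUTPUT per array element */
data long_array;
  set wide;
  array wk{3} wk1-wk3;
  do week = 1 to 3;
    value = wk{week};
    output;
  end;
  keep subjid week value;
run;

/* PROC TRANSPOSE approach: one output record per VAR variable within each BY group */
proc sort data=wide; by subjid; run;

proc transpose data=wide out=long_tr(rename=(col1=value)) name=week_var;
  by subjid;
  var wk1-wk3;
run;
```

The array version gives a numeric WEEK variable directly, while PROC TRANSPOSE stores the original variable name (wk1, wk2, wk3) in WEEK_VAR.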
23. What are _numeric_ and _character_ and what do they do?
Ans:- They are special variable lists that refer to all numeric and all character variables in the dataset, so they can be used wherever a variable list is read or written.
24. How would you create multiple observations from a single observation?
Ans:- Using the double trailing @@ when reading several observations from one line of raw data, or multiple OUTPUT statements in the DATA step.
25. For what purpose would you use the RETAIN statement?
Ans:- The RETAIN statement is used to hold the values of variables across iterations of the DATA step. Normally, variables created by INPUT or assignment statements are set to missing at the start of each iteration of the DATA step.
26. What is the order of evaluation of the arithmetic operators: + - * / ** ()?
Ans:- (), then **, then * and /, then + and -
27. How could you generate test data with no input data?
Ans:- Using a DATA step with a DO loop and OUTPUT statements, or DATA _NULL_ with PUT statements to write a test file.
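As a rough illustration, test data can be generated in a DATA step that has no SET or INPUT statement at all. The variable names and value ranges below are arbitrary assumptions.

```sas
/* Generate 20 hypothetical subjects without reading any input */
data testdata;
  call streaminit(12345);                      /* fixed seed so the data are reproducible */
  do subjid = 1 to 20;
    age = 18 + floor(rand('uniform') * 60);    /* ages from 18 to 77 */
    if rand('uniform') < 0.5 then sex = 'F';
    else sex = 'M';
    output;
  end;
run;

proc print data=testdata; run;
```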
28. How do you debug and test your SAS programs?
Ans:- Using OBS=0 and system options (such as MPRINT, MLOGIC and SYMBOLGEN for macros) to trace the program execution in the log.
29. What can you learn from the SAS log when debugging?
Ans:- It displays the execution of the whole program and its logic. It also displays errors with line numbers so that you can find them and edit the program.
30. What is the purpose of _error_?
Ans:- It is an automatic variable that flags whether an error occurred while processing the current observation. It has only two values: 1 for error and 0 for no error.
31. How can you put a "trace" in your program?
Ans:- By using ODS TRACE ON to trace output objects, or PUT statements to write variable values to the log.
32. How does SAS handle missing values in: assignment statements, functions, a merge, an update, sort order, formats, PROCs?
Ans:- In an assignment statement a missing value is assigned as missing. In sort order, numeric missing values are smaller than any number: ._ is the smallest, followed by . and then .A through .Z.
33. How do you test for missing values?
Ans:- Using subsetting statements like IF-THEN/ELSE, WHERE and SELECT, for example IF X = . or WHERE X IS MISSING, or the MISSING function.
Wednesday, March 5, 2008