Static Analysis Interview Questions and Answers

Part 1:

What are typical defects discovered during static analysis?

★ Referencing a variable with an undefined value
★ Variables declared but used nowhere
★ Dead code
★ Programming standards and syntax violations
★ Security vulnerabilities
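
As a rough illustration of the second item in the list, the sketch below finds variables that are assigned but never read, without running the code under inspection. It assumes the code is Python and uses the standard `ast` module; the sample source and the function name `find_unused_names` are hypothetical, and real static analysis tools do far more (scoping, imports, attribute access).

```python
import ast

# Hypothetical code under inspection; it is parsed, never executed.
SOURCE = """
def area(width, height):
    unused = 42
    result = width * height
    return result
"""

def find_unused_names(source):
    """Return names that are assigned somewhere but never read."""
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)      # name appears on the left of an assignment
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)      # name is read somewhere
    return stored - loaded

print(find_unused_names(SOURCE))  # {'unused'}
```

The check ignores scopes on purpose to stay short, so a name reused in two functions could hide a real finding.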

Who leads a walkthrough?

The author, who presents the document to an audience. The goal can be knowledge sharing or communication.

What is not an important goal of a walkthrough?

Finding defects.

Who leads a formal review process?

The moderator.

What is a formal review technique?

Inspection.
Walkthroughs and peer-to-peer reviews are informal review techniques.

During a review meeting, who logs the defects?

During a review meeting, the author or the scribe logs the defects.

Different roles in a review are assigned during:

Planning.
Roles are assigned during the planning phase so that the reviewers do not all find the same defects.

Entry criteria are determined during which phase?

Planning.
Entry criteria are determined during the planning phase. The document under review is checked against certain standards, so that the review does not become a waste of time because the document contains too many small mistakes.

What are the benefits of static testing?

★ Early feedback on quality
★ Lower rework cost
★ Increased development productivity

What can be found using static testing techniques?

Defects.
Static testing reviews a product without executing it, so it finds defects. If we execute a product that contains a defect, we encounter a failure.

What are static analysis tools?

They give quality information about code without executing it.

Most of the time compilers can be used as static analysis tools.

True.
Static analysis tools are an extension of compiler technology, so most compilers offer some static analysis functionality.

Who generally uses static analysis tools?

Developers.
Static analysis tools are generally used by developers during development and unit testing.

The defects found in static testing and dynamic testing are the same.

False.
During static analysis the program is not executed, so defects such as missing requirements and programming standard violations can be found. During dynamic testing the program is actually executed, so failures can be found.

Static analysis is not a useful and cost-effective way of testing.

False.
Static analysis helps to find defects in documents by reviewing them, so that the defects do not propagate to the next phase.

Look at the output on p. 470. Which of the following statements is true of the effect of age and group differences?
1. The effect of group differences and the effect of age cannot be compared as they are measured differently and represent different variables.
2. The effect of group differences is larger than the effect of age.
3. The effect of group differences and the effect of age are roughly the same.
4. The effect of group differences is smaller than the effect of age.

The effect of group differences is larger than the effect of age.

Look at the output on p. 470. What is the overall effect of the grouping?
1. 0.391
2. 0.882
3. 0.14265
4. 0.609

0.609

Look at the output on p. 470. What is the overall effect of age?

* 0.882
* 0.118
* 0.96933
* <.001

0.882

The adjusted group mean for the DV has been adjusted to standardize it with the other groups based on the grand mean for the covariate.

* True
* False

TRUE

Using the example in the text book (which begins on p. 471), which variable is the covariate in this study?
1. Errors on a driving simulator.
2. Driving experience.
3. Alcohol condition.
4. Both alcohol level and driving experience.

Driving experience

What would the degrees of freedom be if you were reporting the results of the effect of Age from the output on p. 470?
1. 2, 30
2. 2, 26
3. 1, 29
4. 3, 26

2, 26

Consider the output displayed on p. 470. What is the F-Value associated with the effect of age?
1. 23.091
2. 40.509
3. 71.187
4. 7.133

40.509

In the example in question 12 there are 3 groups to consider. If you found that the groups differed significantly on reading ability, what might you use to further explore these group differences?
1. You would have to examine partial eta squared to see which of the groups the difference was between.
2. You would have to examine Pearson’s correlations to see which of the groups the difference was between.
3. You would have to examine pairwise correlations to see which of the groups the difference was between.
4. You would have to examine pairwise comparisons to see which of the groups the difference was between.

You would have to examine pairwise comparisons to see which of the groups the difference was between.

You are conducting a study looking at the group differences in reading ability between three groups of children, all receiving different remedial assistance. You decide that age will covary with reading ability, so you do ANCOVA.

There are 40 children in each of the three groups. The mean reading score for the whole sample is 42.69. The mean for group 1 is 46.92, for group 2 the mean is 41.89, and for group 3 the mean is 41.05.

What is the grand mean?
1. 3.25
2. 43.29
3. 129.86
4. 42.69

43.29

Refer to the example from question 2 again. If you were conducting your analysis in SPSS, what would the fixed factor be?
1. Relationship satisfaction.
2. Depression.
3. Relationship satisfaction and depression are both fixed factors.
4. Attachment style.

Attachment style

Consider the hypothetical study presented in question 2. If you were conducting this analysis, what variable would you put into the covariate box?
1. Depression.
2. Relationship satisfaction.
3. Secure attachment.
4. Attachment style.

Depression

What are the two main reasons for using ANCOVA?
1. To increase error variance AND to adjust the means on the covariate so that the mean covariate score is the same for all participants.
2. To reduce error variance AND to explore patterns of correlations.
3. To reduce error variance AND to correct the means on the covariate.
4. To reduce error variance AND to adjust the means on the covariate so that the mean covariate score is the same for all groups.

To reduce error variance AND to adjust the means on the covariate so that the mean covariate score is the same for all groups

You are conducting a study. The IV is attachment style. There are three groups of individuals with different attachment styles; these are secure, dismissing, and fearful. You want to explore whether these differ on their scores of relationship satisfaction. The DV is relationship satisfaction. You are aware, however, that relationship satisfaction is known to covary with depression.

You conduct an ANCOVA with this data. The formula will remove the variance due to the association between which two variables?
1. Secure attachment and relationship satisfaction.
2. Depression and attachment style.
3. Depression and relationship satisfaction.
4. Attachment style and relationship satisfaction.

Depression and relationship satisfaction

Consider the study in question 2. Which of the below questions would be pertinent to this analysis?
1. Does relationship satisfaction have a significant effect on the relationship between attachment and depression?
2. What would the mean depression score be for the three groups of attachment styles if their levels of relationship satisfaction were constant?
3. What would the mean relationship satisfaction be if levels of depression were constant?
4. What would the means of the groups be on relationship satisfaction if their levels of depression were constant?

What would the means of the groups be on relationship satisfaction if their levels of depression were constant?

What is a grand mean?
1. It is the mean of all group means.
2. It is the population mean.
3. It is the total sample mean, controlling for error.
4. It is the total sample mean.

It is the mean of all group means.
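
As a quick check of this definition, the grand mean for the reading-score example earlier in this part can be computed from the three group means (46.92, 41.89 and 41.05). This is only a sketch of the arithmetic; the variable names are illustrative.

```python
from statistics import mean

# Group means taken from the reading-ability example in this part.
group_means = [46.92, 41.89, 41.05]

# The grand mean is the mean of the group means.
grand_mean = mean(group_means)

print(round(grand_mean, 2))  # 43.29
```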

Which of the below assumptions must be met in order to conduct ANCOVA?
1. The covariate should be linearly related to the dependent variable.
2. The regression lines for the different groups must be parallel to each other.
3. The covariate should be measured without error (reliable).
4. All of the above.

All of the above

What problems do you foresee with the study described in question 2?
1. It is likely that the regression lines will be parallel.
2. It is likely that there will be a linear association between depression and relationship satisfaction.
3. We don’t know how reliably we can measure depression.
4. There could be more than three groups.

We don’t know how reliably we can measure depression.

Which of the below designs would be best suited to ANCOVA?
1. Participants were placed in four treatment groups for eating disorders. Their cognitive distortions regarding eating and food were measured before treatment, and again after 6 months of intensive treatment.
2. Participants were placed in four treatment groups for eating disorders. Their cognitive distortions regarding eating and food were measured before treatment, and this is used to allocate them to groups. You are exploring whether participants were allocated appropriately.
3. Participants were placed in four treatment groups for eating disorders. You are examining the relationship between cognitive distortions regarding eating and their therapists rating of improvement over a 6 month treatment period.
4. Participants were placed in four treatment groups for eating disorders. Their cognitive distortions regarding eating and food were compared after 6 months of intensive treatment.

Participants were placed in four treatment groups for eating disorders. Their cognitive distortions regarding eating and food were measured before treatment, and again after 6 months of intensive treatment.

When conducting an ANCOVA in SPSS, which function would you select from the analyze drop down list?
1. General Linear Model.
2. Classify.
3. ANCOVA.
4. Time Series.

General Linear Model

If the assumptions for conducting an ANCOVA are not met, what could you do?
1. Use ANOVA.
2. Use MANOVA.
3. You could repeat your study and control for the covariate experimentally.
4. Use regression.

You could repeat your study and control for the covariate experimentally.

Part 2:

List some measures used in the study of static measures.

The study uses eight measures:
☛ Number of sentences
☛ Number of atomic conditions per decision
☛ Total number of decisions
☛ Number of equalities
☛ Number of inequalities
☛ Nesting degree
☛ McCabe’s cyclomatic complexity
☛ Branch coverage

What are the kinds of measures?

There are two kinds of measures:
Dynamic measures, which require the execution of the program.
Static measures, which do not require execution.

What does cyclomatic complexity apply to?

Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program, and is formally defined as follows:
v(G) = E − N + 2P
where E is the number of edges of the graph, N is the number of nodes of the graph and P is the number of connected components.
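
The formula can be sketched in a few lines. The control flow graph below is a hypothetical one (a function with one if/else followed by one while loop), and the function name `cyclomatic_complexity` is illustrative rather than taken from any tool.

```python
def cyclomatic_complexity(edges, num_components=1):
    """Apply v(G) = E - N + 2P to a graph given as (source, target) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * num_components

# Hypothetical control flow graph: one if/else followed by one while loop.
cfg = [
    ("entry", "if"),
    ("if", "then"), ("if", "else"),
    ("then", "loop"), ("else", "loop"),
    ("loop", "body"), ("body", "loop"),
    ("loop", "exit"),
]

print(cyclomatic_complexity(cfg))  # 3
```

With 8 edges, 7 nodes and one connected component the result is 8 − 7 + 2 = 3, which matches the usual shortcut of counting the two decisions and adding one.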

How is cyclomatic complexity computed?

Cyclomatic complexity is computed using the control flow graph of the program. The nodes of the graph correspond to indivisible groups of sentences of a program, and a directed edge connects two nodes if the second sentence might be executed immediately after the first sentence.

What is cyclomatic complexity?

Cyclomatic complexity is a complexity measure of code related to the number of ways there are to traverse a piece of code. This measure determines the minimum number of test cases needed to test all the paths using linearly independent circuits.

What is statement coverage?

Statement coverage is defined as the percentage of statements that are executed.

What is branch coverage?

Branch coverage is the percentage of branches exercised in a program. This coverage measure is used in most of the related papers in the literature.

What is a nesting degree?

The nesting degree is the maximum number of conditional statements that are nested one inside another.
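
A minimal sketch of measuring this statically, assuming Python source and the standard `ast` module. Counting `if`, `while` and `for` as the nested statements is an assumption made here, and the sample code and the name `nesting_degree` are hypothetical. Note that Python represents `elif` as an `if` nested in the `else` branch, so this sketch would count an `elif` chain as nesting.

```python
import ast

def nesting_degree(source):
    """Maximum depth of if/while/for statements nested inside one another."""
    def depth(node, current):
        if isinstance(node, (ast.If, ast.While, ast.For)):
            current += 1
        deepest = current
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, depth(child, current))
        return deepest
    return depth(ast.parse(source), 0)

# Hypothetical code under inspection; it is parsed, never executed.
SAMPLE = """
def classify(values):
    for v in values:
        if v > 0:
            if v % 2 == 0:
                print('positive even')
    if not values:
        print('empty')
"""

print(nesting_degree(SAMPLE))  # 3
```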

What is the number of (in)equalities?

The number of (in)equalities is the number of times that the equality (inequality) operator is found in the atomic conditions of a program.

What are quantitative models?

Quantitative models are frequently used in different engineering disciplines for predicting situations, due dates, required cost and so on. These quantitative models are based on some kind of measure performed on project data or items.

Should you use a one-tailed, or a two-tailed hypothesis when doing a chi square test?
1. Two-tailed.
2. It doesn’t matter.
3. One-tailed.
4. SPSS will include the right one in the output.

Answer: It doesn’t matter.

One serious complication associated with the analysis of more than three levels (4 x 5) is?
1. The sample size would have to be so large that chi square analysis would not be powerful enough to interpret the data.
2. It can be difficult to interpret accurately all of the relationships within a large contingency table.
3. That type of analysis would not meet the criteria for a chi square test.
4. They would have to be analyzed by hand as SPSS has no option for tables larger than 3×3.

Answer: It can be difficult to interpret accurately all of the relationships within a large contingency table.

Data for a chi square test should be assumed to have no fewer than one participant per cell. If there is less than one participant per cell, it is sometimes useful to combine cells together into one category.

* True
* False

Answer: TRUE

Refer back to the example in question 14, which looks at submission of essays and time planning. In terms of essay submission, the numbers of early, late and on-time students are counted. The number of students who planned their time was also counted, leading to two levels of time planning (planned or not). How would this analysis be described?
1. 3×2
2. 3×3
3. 2×2
4. 3×1

Answer: 3×2

For a 2×2 chi square test, which of the following equations would be used to calculate the degrees of freedom?
1. (r-1) x (c-1) x (n-1)
2. (r-c) x (n-1)
3. (r-1) x (c-1)
4. (r+1) x (c+1)

Answer: (r-1) x (c-1)

What does the Fisher’s Exact Probability Test show?
1. Fisher’s Exact Probability Test shows the F statistic associated with the chi square value when the null is assumed to be true.
2. The Fisher’s Exact Probability Test shows the probability of reaching the assumption of 25% of cells with an expected frequency of less than 5.
3. The Fisher’s Exact Probability Test shows the percentage of variation which one variable accounts for in the other.
4. The Fisher’s Exact Probability Test shows the probability of obtaining the chi square value when the null is assumed to be true.

Answer: The Fisher’s Exact Probability Test shows the probability of obtaining the chi square value when the null is assumed to be true.

You conduct a study exploring whether or not students planned their time and whether or not they submitted their assignment on time. Your SPSS output shows a value for Cramer’s V of 0.42. How would you interpret this?
1. 8% of the variation in frequency counts of essay submission timing (on time or late) can be explained by time planning.
2. 42% of the variation in frequency counts of essay submission timing (on time or late) can be explained by time planning.
3. 64.8% of the variation in frequency counts of essay submission timing (on time or late) can be explained by time planning.
4. 4.2% of the variation in frequency counts of essay submission timing (on time or late) can be explained by time planning.

Answer: 8% of the variation in frequency counts of essay submission timing (on time or late) can be explained by time planning.

What is Cramer’s V used for?
1. Cramer’s V is used instead of χ² in analyses which are bigger than 2×2.
2. Cramer’s V is used when assumptions for conducting chi square are violated.
3. Cramer’s V is a measure of effect used for tests of association.
4. Cramer’s V is a way of reporting the ratio between the observed and expected scores.

Answer: Cramer’s V is a measure of effect used for tests of association.

When reporting your results, what elements should you include from the SPSS output?
1. The number of participants, the χ² value, and the probability level
2. The Pearson’s χ², degrees of freedom and the probability level.
3. The number of participants, the degrees of freedom, χ², and the probability level
4. The χ² and the probability level.

Answer: The Pearson’s χ², degrees of freedom and the probability level.

If the assumption mentioned in question 10 is not met for a 2×2 chi square test, you should proceed to conducting _________?
1. A Pearson’s correlation coefficient
2. A 2×2 test of independence
3. One variable chi square test (goodness of fit)
4. A Fisher’s Exact Probability Test

Answer: a Fisher’s Exact Probability Test

A fundamental assumption of chi square tests is that no more than ____ % of cells can have an expected frequency of less than ____.
1. 25; 5
2. 75; 95
3. 4; 1
4. 5; n

Answer: 25; 5

Which of the below statements is false of chi square testing?
1. Chi square tests can be used to check how well a model fits the data
2. Chi square can be applied to continuous variables; it just means that a larger contingency table is needed.
3. Chi square is used in research to measure the association between two categorical variables.
4. None of these statements are false, it is a trick question.

Answer: Chi square can be applied to continuous variables; it just means that a larger contingency table is needed.

Although in one variable chi square testing each participant cannot be in more than one group, in a 2×2 chi square test this rule does not apply.

* True
* False

Answer: FALSE

Examine the output on p. 268. How would these results be reported?
1. The chi square value of 10.490 (DF=317) achieved an associated p value of <.001. There was a significant difference between the expected and the observed frequencies. We can conclude that there is a greater prevalence of right handedness in women with IBS.
2. The chi square value of 317 (DF=1) achieved an associated p value of <.001. There was a significant difference between the expected and the observed frequencies. We can conclude that there is a greater prevalence of left handedness in women with IBS.
3. The chi square value of 10.490 (DF=1) achieved an associated p value of .001. There was no significant difference between the expected and the observed frequencies. We can conclude that being left or right handed is unrelated to IBS in women.
4. The chi square value of 10.490 (DF=1) achieved an associated p value of <.001. There was a significant difference between the expected and the observed frequencies. We can conclude that there is a greater prevalence of left handedness in women with IBS.

Answer: The chi square value of 10.490 (DF=1) achieved an associated p value of <.001. There was a significant difference between the expected and the observed frequencies. We can conclude that there is a greater prevalence of left handedness in women with IBS.

You are conducting a one variable chi square test to test the hypothesis that there are equal numbers of vegetarians, meat eaters, and vegans eating at the student union. The categories are vegetarian, meat eaters, and vegans. Having conducted a survey, you found 85 individuals were vegetarian, 122 ate meat, and 32 followed a vegan diet. What would the expected frequencies be in each cell?
1. 239
2. 85, 122, and 32.
3. 79.67
4. There is insufficient information provided to calculate the expected frequencies.

Answer: 79.67

How do we calculate the degrees of freedom for a goodness of fit test?
1. Number of categories -1.
2. Number of categories x n.
3. N/ (Number of categories-1).
4. n-1.

Answer: Number of categories -1.

Which of the following hypotheses would be suited for testing by a one variable chi square test?
1. It is hypothesized that in terms of car color, more individuals choose a red car, than a green, a black, or a silver car.
2. Choice of car color is directly related to measures of extroversion.
3. Individuals with red cars are significantly more extroverted than are individuals with green, black or silver cars.
4. None of the above

Answer: It is hypothesized that in terms of car color, more individuals choose a red car, than a green, a black, or a silver car.

Using a goodness of fit we can test whether a set of obtained frequencies differ from a set of ______ frequencies?
1. Expected
2. Observed
3. Constant
4. Independent

Answer: Expected

What sort of data is appropriate for chi square tests?
1. Scaled scores.
2. Rank ordered data.
3. Continuous scores.
4. Frequency counts

Answer: Frequency counts
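
The goodness-of-fit question above (85 vegetarians, 122 meat eaters, 32 vegans) can be worked through on its frequency counts. The sketch below assumes the null hypothesis of equal numbers in each category, as stated in that question; the variable names are illustrative.

```python
# Observed frequency counts from the student union example.
observed = {"vegetarian": 85, "meat eater": 122, "vegan": 32}

total = sum(observed.values())      # 239 participants in all
expected = total / len(observed)    # equal split under the null hypothesis

# Chi square statistic: sum of (observed - expected)^2 / expected over the cells.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())

# Goodness of fit: degrees of freedom = number of categories - 1.
degrees_of_freedom = len(observed) - 1

print(round(expected, 2))           # 79.67
print(round(chi_square, 2), degrees_of_freedom)
```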

Part 3:

What is linear regression?

Linear regression models the relationship between a scalar variable y and one or more variables denoted X. The data are modeled using linear functions, and the unknown model parameters are estimated from the data.
polyfit(x,y2,1) % returns 2.1667 -1.3333, i.e. 2.1667x - 1.3333
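
The same fit can be reproduced by hand with ordinary least squares. The sketch below assumes that x and y2 are the vectors defined in the mode and covariance answers further down (x = [1 2 3 3 3 4 4], y2 = [1 3 4 5 6 7 8]); the helper name `fit_line` is illustrative.

```python
# Assumed data: the x and y2 vectors defined later in this part.
x = [1, 2, 3, 3, 3, 4, 4]
y2 = [1, 3, 4, 5, 6, 7, 8]

def fit_line(xs, ys):
    """Least-squares slope and intercept for a straight line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(x, y2)
print(round(slope, 4), round(intercept, 4))  # 2.1667 -1.3333
```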

What is the null hypothesis?

The null hypothesis (denoted by H0) is a statement about the value of a population parameter (such as the mean). It must contain the condition of equality and must be written with the symbol =, ≤, or ≥.

Explain the central limit theorem.

As the sample size increases, the sampling distribution of sample means approaches a normal distribution.
If all possible random samples of size n are selected from a population with mean μ and standard deviation σ, the mean of the sample means is denoted by μ x̄, so
μ x̄ = μ
and the standard deviation of the sample means is
σ x̄ = σ/√n

What is a hash table?

A hash table is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
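
A minimal sketch of the idea, assuming separate chaining (each bucket holds a list of key/value pairs) and a fixed number of buckets. The class and method names are illustrative; in practice Python's built-in `dict` already plays this role.

```python
class HashTable:
    """Toy hash table with a fixed number of buckets and separate chaining."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function maps the key to a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

table = HashTable()
table.put("alpha", 0.05)
table.put("alpha", 0.01)
print(table.get("alpha"), table.get("beta"))  # 0.01 None
```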

What is binary search?

For binary search, the array should be arranged in ascending or descending order. In each step, the algorithm compares the search key value with the key value of the middle element of the array. If the keys match, then a matching element has been found and its index, or position, is returned. Otherwise, if the search key is less than the middle element’s key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the search key is greater, on the sub-array to the right.
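
The steps above can be sketched as follows, assuming a list sorted in ascending order. Returning -1 for a missing key is a convention chosen for this sketch.

```python
def binary_search(sorted_items, key):
    """Return the index of key in an ascending list, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid              # keys match: position found
        if key < sorted_items[mid]:
            high = mid - 1          # continue in the left sub-array
        else:
            low = mid + 1           # continue in the right sub-array
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```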

What is the binomial probability formula?

P(x) = n!/[(n-x)! x!] · p^x · q^(n-x)
where n = number of trials
x = number of successes among n trials
p = probability of success in any one trial
q = 1 - p
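
A short sketch of the formula using `math.comb` for the n!/[(n-x)! x!] term. The example values (14 successes in 20 trials with p = 0.5) are chosen here for illustration.

```python
from math import comb

def binomial_probability(n, x, p):
    """P(x) = C(n, x) * p^x * q^(n-x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Probability of exactly 14 successes in 20 trials when p = 0.5.
print(round(binomial_probability(20, 14, 0.5), 4))  # 0.037
```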

Give an example of the central limit theorem.

Given that the population of men has normally distributed weights, with a mean of 173 lb and a standard deviation of 30 lb, find the probability that
a. if 1 man is randomly selected, his weight is greater than 180 lb.
b. if 36 different men are randomly selected, their mean weight is greater than 180 lb.

Solution:
a) z = (x - μ)/σ = (180 - 173)/30 = 0.23
For the standard normal distribution, P(Z > 0.23) = 0.4090
b) σ x̄ = σ/√n = 30/√36 = 5
z = (180 - 173)/5 = 1.40
P(Z > 1.40) = 0.0808
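
The two probabilities can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Because the code does not round z to 0.23 before the lookup, part (a) comes out near 0.408 rather than the table value 0.4090; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 173, 30, 36

# (a) one man: use the population distribution directly.
p_single = 1 - NormalDist(mu, sigma).cdf(180)

# (b) mean of 36 men: the standard deviation of the sample mean is sigma / sqrt(n).
standard_error = sigma / sqrt(n)
p_mean = 1 - NormalDist(mu, standard_error).cdf(180)

print(standard_error)        # 5.0
print(round(p_single, 3))    # 0.408
print(round(p_mean, 4))      # 0.0808
```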

What is the significance level?

The probability of rejecting the null hypothesis when it is true is called the significance level α. Very common choices are α = 0.05 and α = 0.01.

What is the alternative hypothesis?

The alternative hypothesis (denoted by H1) is the statement that must be true if the null hypothesis is false.

What is a one-sample t-test?

A t-test is any statistical hypothesis test in which the test statistic follows a Student’s t distribution if the null hypothesis is true. A one-sample t-test compares the mean of a single sample with a hypothesized value.
[h,p,ci] = ttest(y2,0) % returns h = 1, p = 0.0018, ci = [2.6280 7.0863]

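What is covariance?

Covariance is a measure of how much two variables change together.
x = [1 2 3 3 3 4 4] % same vector as in the mode example
y2 = [1 3 4 5 6 7 8]
cov(x,y2) % returns a 2x2 matrix; the diagonal holds the variances and the off-diagonal entries hold the covariance

What is a moment?

A moment is a quantitative measure of the shape of a set of points.
moment(x, 2) % returns the second central moment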

What is kurtosis?

Kurtosis is a measure of how outlier-prone a distribution is.
kurtosis(x) % returns 2.3594

What is variance?

Variance describes how far values lie from the mean.
var(x) % returns 1.1429

What is skewness?

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right.
skewness(x) % returns -0.5954

What is a quartile?

Quartiles are the values that divide ordered data into four equal parts:
► first quartile (25th percentile)
► second quartile (50th percentile)
► third quartile (75th percentile)
More generally, the kth percentile is computed with prctile:
prctile(x, 25) % 25th percentile, returns 2.25
prctile(x, 50) % 50th percentile, returns 3, i.e. the median

What is the median?

The median is the numeric value separating the higher half of a sample, a population, or a probability distribution from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one.
median(x) % returns 3

What is the mode?

The mode of a data sample is the element that occurs most often in the collection.
x=[1 2 3 3 3 4 4]
mode(x) % returns 3, the most frequent value
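
The descriptive statistics quoted in the answers above (variance, skewness, kurtosis, median and mode) can be cross-checked for the same vector x. The sketch assumes the sample variance divides by n - 1, while skewness and kurtosis are ratios of central moments that divide by n, which is what reproduces the quoted values; `central_moment` is an illustrative helper.

```python
from statistics import mean, median, mode, variance

x = [1, 2, 3, 3, 3, 4, 4]

def central_moment(data, order):
    """Mean of (value - mean)^order, dividing by n."""
    m = mean(data)
    return sum((v - m) ** order for v in data) / len(data)

skewness = central_moment(x, 3) / central_moment(x, 2) ** 1.5
kurtosis = central_moment(x, 4) / central_moment(x, 2) ** 2

print(round(variance(x), 4))   # 1.1429
print(median(x), mode(x))      # 3 3
print(round(skewness, 4))      # -0.5954
print(round(kurtosis, 4))      # 2.3594
```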

What are sampling methods?

There are four sampling methods:
► Simple random (purely random)
► Systematic (every kth member of the population)
► Cluster (population divided into groups or clusters)
► Stratified (population divided into exclusive groups or strata, with a sample drawn from each group)
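
The four methods can be sketched on a toy population of 100 numbered members. The population, the sample sizes, the two strata and the ten clusters are all assumptions made for this illustration.

```python
import random

population = list(range(1, 101))   # hypothetical population of 100 members
rng = random.Random(0)             # seeded so the sketch is repeatable

# Simple random: every member has the same chance of selection.
simple = rng.sample(population, 10)

# Systematic: every kth member after a random start.
k = len(population) // 10
start = rng.randrange(k)
systematic = population[start::k]

# Cluster: split the population into clusters, then take whole clusters at random.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [m for cluster in rng.sample(clusters, 2) for m in cluster]

# Stratified: split into exclusive strata, then sample from each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [m for group in strata.values() for m in rng.sample(group, 5)]

print(len(simple), len(systematic), len(cluster_sample), len(stratified))  # 10 10 20 10
```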

What is sampling?

Sampling is that part of statistical practice concerned with the selection of an unbiased or random subset of individual observations within a population of individuals intended to yield some knowledge about the population of concern.

Give an example of a p-value.

Suppose that the experimental results show the coin turning up heads 14 times out of 20 total flips:
► null hypothesis (H0): fair coin;
► observation O: 14 heads out of 20 flips; and
► p-value of observation O given H0 = Prob(≥ 14 heads or ≥ 14 tails) = 0.115.
The calculated p-value exceeds 0.05, so the observation is consistent with the null hypothesis – that the observed result of 14 heads out of 20 flips can be ascribed to chance alone – as it falls within the range of what would happen 95% of the time were this in fact the case. In our example, we fail to reject the null hypothesis at the 5% level. Although the coin did not fall evenly, the deviation from expected outcome is small enough to be reported as being “not statistically significant at the 5% level”.
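
The value 0.115 can be reproduced from the binomial distribution of a fair coin. The sketch below sums the probabilities of 14 or more heads and doubles the result to include 14 or more tails; the function name is illustrative.

```python
from math import comb

def two_sided_coin_p_value(heads, flips):
    """P(at least this many heads or at least this many tails) for a fair coin."""
    extreme = max(heads, flips - heads)
    one_tail = sum(comb(flips, k) for k in range(extreme, flips + 1)) / 2**flips
    return min(1.0, 2 * one_tail)

p = two_sided_coin_p_value(14, 20)
print(round(p, 3))  # 0.115
print(p > 0.05)     # True, so the null hypothesis is not rejected at the 5% level
```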

What is a p-value?

In statistical significance testing, the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. The null hypothesis is typically rejected if the p-value is less than 0.05 or 0.01, corresponding respectively to a 5% or 1% chance of rejecting the null hypothesis when it is true.

What is likelihood?

The probability of some observed outcomes given a set of parameter values is regarded as the likelihood of the set of parameter values given the observed outcomes.

What is the frequentist approach?

Frequentists condition on a hypothesis of choice and consider the probability distribution on the data, whether observed or not.

What is the Bayesian approach?

Bayesians condition on the data actually observed and consider the probability distribution on the hypotheses.

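When you are creating a statistical model, how do you prevent overfitting?

Use cross-validation: evaluate the model on data that was held out from fitting, so that a model which merely memorizes its training data is detected.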
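
One common form is k-fold cross-validation. The sketch below only shows how the data indices might be split into k train/test folds; fitting and scoring a model on each fold is left out, and the function name `k_fold_indices` is illustrative.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k folds of nearly equal size."""
    indices = list(range(n_samples))
    # The first n_samples % k folds get one extra element.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_indices(10, 3):
    print(test)
```

Each observation lands in exactly one test fold, so every data point is used once for evaluation and k - 1 times for fitting.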

Statistics Job Interview Preparation Questions Part Five!

What is meant by human security?
What is Max Weber’s theory?
What does the theory describe?
What does the term social stratification define?
How does egotistic suicide help society get rid of people who are not willing to live?
How many relationships exist within a culture?
What are the problems faced by people due to unemployment?
What is the theory of dual burden?
What are the effective measures taken towards racial discrimination?
What are the indicators used to show social development?

Statistics Job Interview Preparation Questions Part Four!

How is psychology different from sociology?
What are the traits involved in social reforms?
What are the different components that are required to create a society’s culture?
What is the difference between social change and development?
What are the different stereotypes used to define group relations?
What are the laws required by civilization?
Define anticipatory socialization.
What is the “conflict theory” in sociology?
How do art and design affect different cultures?
What are the various branches that exist in sociology?

Statistics Job Interview Preparation Questions Part Three!

What are the different types of deviance that exist?
What is the difference between tertiary and secondary deviance?
What are the different areas of sociology?
What is the difference between urban and rural communities?
How can racism be abolished in society?
In what different ways can patriotism be shown?
What are the different types of agents present in socialization?
What is the meaning of incest?
What is the function of incest?
How can cultural diversity be reduced across different cultures?

Statistics Job Interview Preparation Questions Part Two!

What are the different types of story defined in sociology?
What are the main functions of formalism?
How can the problems occurring in contemporary culture be managed?
What is hegemony?
What is the difference between adaptive and real culture?
What is the theory of Non Symbolic interactionism?
What are the different agencies of socialization?
What is the difference between appropriate and inappropriate behavior?
What are the different principles involved in natural science?
What are cultural traits?

Statistics Job Interview Preparation Questions Part One!

1. What are the factors that changed the role of women in today’s society?
2. What are the factors involved in influencing the crime?
3. What is the purpose of interpersonal communication?
4. What are the different types of research possible?
5. What is the difference between subculture and counterculture?
6. What kind of impact does social devaluation have?
7. What are the different components of culture?
8. How do social relations affect individuals’ relationships with one another?
9. What are the disadvantages of living in a counterculture?
10. What are the disadvantages of having too much freedom?