Sorry for the confusion. We used a different resource when writing this question; however, this is a good example of how levels of evidence can be interpreted differently. A non-randomized cohort design is sometimes considered a lower-quality level II. For the purposes of this question, though, it is a higher level than the other options, so it is the correct answer. The design involves a cohort (not single subjects) and at least 2 groups, since "non-randomized" means that subjects were assigned to groups without randomization. Having a comparison group makes this a stronger design than a case series.
Literature reviews are merely summaries of what has been published on a topic; systematic reviews and meta-analyses, however, are stronger designs if they analyze and interpret data from multiple studies.
Case studies are usually level V, unless they have been grouped and analyzed as a series.
A multiple baseline design is a single-subject research design in which the subject serves as its own control. The researcher measures the baseline status of a trait of interest (such as walking speed), then applies a treatment before measuring that trait again. A stronger design will repeat this process and take multiple measurements across alternating periods of intervention and no intervention. The term “multiple baselines” refers to the multiple measurements of the trait taken over the course of the study. If the same design is used across 3 or more subjects and the data are analyzed collectively, the strength of the study increases.