Functional Magnetic Resonance Imaging (fMRI) is a non-invasive tool used to investigate brain function. The processing of fMRI data consists of multiple steps, and the final results often depend greatly on the options chosen at each step: for example, head motion correction, slice timing correction, registration to a common space, pre-whitening, hemodynamic response function modelling and multiple comparison correction. As most of these methods were introduced when fMRI was in its infancy and were initially validated only on small datasets, it is questionable whether the default methods in the popular analysis packages are optimal. Despite the huge popularity of fMRI, there have been few studies validating statistical methods. This thesis presents a validation of statistical methods used in task fMRI studies that relate to pre-whitening and to hemodynamic response function modelling. It considers fMRI acquired with the blood oxygenation level dependent (BOLD) contrast.
Firstly, I compared the most frequently used fMRI analysis packages, AFNI, FSL and SPM, with regard to temporal autocorrelation modelling, often known as pre-whitening. I employed eleven datasets containing 980 scans corresponding to different fMRI protocols and subject populations. Though autocorrelation modelling in AFNI was not perfect, it performed much better than autocorrelation modelling in FSL and SPM. The residual autocorrelated noise in FSL and SPM led to heavily confounded first level results, particularly for low-frequency experimental designs. My results show that SPM's alternative pre-whitening option, FAST, outperforms SPM's default algorithm. The reliability of task fMRI studies would increase with more accurate autocorrelation modelling. Reliability could increase further if the packages provided diagnostic plots, as these would make the investigator aware of pre-whitening problems.
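To illustrate what pre-whitening and a simple residual diagnostic involve, the following is a minimal sketch on simulated data. It assumes a first-order autoregressive, AR(1), noise model and a two-step estimate-then-transform procedure chosen here for brevity; it is not the algorithm used by AFNI, FSL or SPM, and the function names (`lag1_autocorr`, `ols_residuals`, `ar1_prewhiten`) and simulation settings are hypothetical.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1-D series (mean removed first)."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


def ols_residuals(X, y):
    """Residuals of an ordinary least squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def ar1_prewhiten(X, y):
    """Estimate rho from the OLS residuals, then filter both X and y.

    Assumption: AR(1) noise. Each row t becomes row[t] - rho * row[t-1],
    so the first time point is dropped.
    """
    rho = lag1_autocorr(ols_residuals(X, y))
    return X[1:] - rho * X[:-1], y[1:] - rho * y[:-1], rho


# Simulated voxel time series: slow boxcar design plus AR(1) noise.
rng = np.random.default_rng(0)
n, true_rho = 400, 0.5
eps = rng.standard_normal(n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = true_rho * noise[t - 1] + eps[t]

boxcar = ((np.arange(n) // 20) % 2).astype(float)  # 20 volumes off, 20 on
X = np.column_stack([np.ones(n), boxcar])          # intercept + task regressor
y = X @ np.array([10.0, 1.0]) + noise

# Diagnostic: residual lag-1 autocorrelation before and after pre-whitening.
raw_r1 = lag1_autocorr(ols_residuals(X, y))
Xw, yw, rho_hat = ar1_prewhiten(X, y)
white_r1 = lag1_autocorr(ols_residuals(Xw, yw))

print(f"estimated rho: {rho_hat:.3f}")
print(f"lag-1 autocorrelation, OLS residuals: {raw_r1:.3f}")
print(f"lag-1 autocorrelation, pre-whitened residuals: {white_r1:.3f}")
```

If the noise model is adequate, the second diagnostic value should be close to zero. A value that stays clearly away from zero indicates residual autocorrelation of the kind discussed above.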
Next, I compared a number of hemodynamic response function models available in AFNI, FSL and SPM in terms of their specificity-sensitivity trade-offs. Again, I used different datasets to represent different fMRI protocols and experimental designs: altogether, scans of 772 subjects from five experiments. In contrast to previous studies, I used real data rather than simulations, investigated methods from more than one software package, and employed scans of many subjects. Among other results, I found that the use of the temporal and dispersion derivatives led to large sensitivity increases compared to the use of the canonical model alone, but only when the experimental design was event-related and when statistical inference was based on an F-test of the variance explained by the canonical function together with the derivatives, rather than a t-test of the variance explained by the canonical function only. This was the case for both single subject and group level analyses.
Finally, I investigated the effect of ageing on the BOLD signal. For this, I used the Cambridge Centre for Ageing and Neuroscience (CamCAN) data of 641 subjects between 18 and 88 years old. I investigated how the shape of the hemodynamic response function changes with age and whether it is on average similar to the canonical function. The CamCAN task fMRI data enabled the estimation of the hemodynamic response function in the auditory, visual and motor regions. I used the biophysical balloon model to investigate whether values of BOLD-derived physiological parameters vary with age and whether these variations can explain the differences in the hemodynamic response function with age. CamCAN magnetoencephalography (MEG) data enabled a correlation of the results with neural delay estimates. The hemodynamic response function was found to vary substantially with age, with response delays observed in all considered regions. The estimated balloon model parameters were also found to vary with age. A robustness analysis of SPM's balloon model revealed serious problems with SPM's current estimation procedure for this model.
Overall, this thesis presents novel validations of a number of popular statistical methods used in task fMRI studies. I identified several relevant problems related to pre-whitening and hemodynamic response function modelling. Importantly, I address ways of dealing with these problems so that sensitivity and specificity in task fMRI studies can be improved.