Heteroscedastic Gaussian processes (HGPs) are kernel-based, non-parametric models for inferring nonlinear functions with time-varying noise. In robotics, they can be employed for learning from demonstration as motion primitives, i.e., as models of the trajectories to be executed by the robot. HGPs provide variance estimates around the reference signal modeling the trajectory, capturing both the predictive uncertainty and the motion variability; this uncertainty can be leveraged for more complex learning tasks, such as inferring the variable impedance profile required of a robotic manipulator. However, like standard Gaussian processes, HGPs suffer from a cubic complexity in the number of training points, due to the inversion of the kernel matrix, so suitable approximations are needed to make them scalable, at the price of potentially degrading the posterior mean and variance profiles. Motivated by these observations, we study the combination of HGPs and random features, a popular, data-independent approximation strategy for kernel functions. In a theoretical analysis, we provide novel guarantees on the approximation error of the HGP posterior induced by random features. Moreover, we validate this scalable motion primitive on real robot data related to the problem of variable impedance learning. In this way, we show that random features offer a viable and theoretically sound alternative for speeding up trajectory processing without sacrificing accuracy.
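To illustrate the approximation strategy referred to above, the following is a minimal sketch of random Fourier features for an RBF kernel, assuming Gaussian spectral sampling; the function name, lengthscale, and feature count are illustrative choices, not taken from the paper, and this is a generic construction rather than the authors' implementation.

```python
import numpy as np

def random_fourier_features(X, num_features, lengthscale, rng):
    """Map inputs X of shape (n, d) to random Fourier features that
    approximate the RBF kernel k(x, x') = exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    n, d = X.shape
    # Frequencies sampled from the kernel's spectral density (Gaussian for RBF)
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # hypothetical: 500 trajectory points in 3-D
Z = random_fourier_features(X, num_features=100, lengthscale=1.0, rng=rng)

# An m-dimensional feature map replaces the n x n kernel matrix, so the
# cubic O(n^3) cost of kernel-matrix inversion drops to roughly
# O(n m^2 + m^3) with m << n, at the price of an approximation error.
K_approx = Z @ Z.T  # approximates the exact RBF kernel matrix
```

The key property, being data-independent, is that the frequencies W and phases b are drawn once from the kernel's spectral density, without looking at the training data, which is what makes the guarantees on the approximated posterior tractable to state.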