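Suppose we are given a set of n i.i.d. samples from a Gaussian with known variance but unknown mean. We wish to estimate the mean. It is well known that the sample mean is an excellent estimator for the true mean: it is unbiased, and its variance is the known variance divided by n, so it concentrates around the true mean as n grows.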
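To make the setup concrete, here is a minimal simulation sketch of the sample-mean estimator. The specific values (true mean 2.0, standard deviation 1.0, n = 100, 2000 repeated trials) and the helper names `sample_mean` and `simulate` are illustrative assumptions, not anything taken from the setting above.

```python
import random
import statistics

def sample_mean(samples):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(samples) / len(samples)

def simulate(true_mean, sigma, n, trials, seed=0):
    """Draw `trials` independent datasets of n Gaussian samples each
    and return the sample mean computed on each dataset."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = []
    for _ in range(trials):
        samples = [rng.gauss(true_mean, sigma) for _ in range(n)]
        estimates.append(sample_mean(samples))
    return estimates

# Hypothetical parameters chosen only for illustration.
estimates = simulate(true_mean=2.0, sigma=1.0, n=100, trials=2000)

# Average of the estimates: should be close to the true mean (unbiasedness).
avg_estimate = statistics.fmean(estimates)

# Spread of the estimates: should be close to sigma**2 / n = 0.01.
var_estimate = statistics.pvariance(estimates)

print(f"average estimate:      {avg_estimate:.4f}")
print(f"variance of estimates: {var_estimate:.5f}")
```

Across repeated trials, the average of the estimates should land near the true mean and their empirical variance near sigma squared over n, which is one way to see why the sample mean is a natural choice here.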
In the ML reading group this Friday, we will look at a circuit model (or "architecture") called "transformer" that ostensibly underlies recent popular applications such as ChatGPT, Bard and friends.