The X-minute city vision's focus on the neighbourhood scale is demonstrably useful as part of a larger climate change mitigation strategy. However, emphasis on the neighbourhood risks over-simplifying the processes by which critical urban functions are created. Many urban functions depend on the interplay between spatial scales, and this interplay needs to be addressed in the X-minute city vision. Here, we provide examples of how multi- and cross-scale interactions play an important role in generating three critical urban functions: ensuring food security, adapting to climate change and reviving public life.