Stephanie Cooper

(formerly Stephanie Moreaud)

Innovative Computing Laboratory
University of Tennessee

e-mail :


I recently finished my term as a postdoctoral researcher at the Innovative Computing Laboratory at the University of Tennessee.
I did my PhD in the INRIA Runtime team of LaBRI (University of Bordeaux, France), under the direction of Raymond Namyst and Brice Goglin.

Research activities

My primary research interest is data transfer on hierarchical machines in the context of high-performance computing (HPC). I study how the physical location of tasks affects communication performance, depending on the hardware topology (Non-Uniform Input/Output Access, NUMA, multicore, hierarchy of shared caches, ...).
I look for ways to improve data transfer performance, either by adapting task placement to the communication scheme or by adapting transfer strategies to a predefined binding of processes.

A list of my publications
My resume

Teaching activities

I taught at the Computer Science Department of the University of Bordeaux (France) from 2007 to 2011.

2010-2011 Teaching Assistant at the University of Bordeaux, Computer Science Department

2009-2010 Teaching Assistant at the University of Bordeaux, Computer Science Department

2008-2009 Teaching Assistant at the University of Bordeaux, Computer Science Department

2007-2008 Teaching Assistant at the University of Bordeaux, Computer Science Department

The course details and associated documents are available here (in French).


Journal articles

  • B. Goglin and S. Moreaud. KNEM: a Generic and Scalable Kernel-Assisted Intra-node MPI Communication Framework. Journal of Parallel and Distributed Computing (JPDC), 73(2):176-188, Feb. 2013.

Refereed conferences and workshops

  • F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, and R. Namyst. hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications. In 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), Pisa, Italy, Feb. 2010. IEEE Computer Society Press.
  • D. Buntinas, B. Goglin, D. Goodell, G. Mercier, and S. Moreaud. Cache-Efficient, Intranode Large-Message MPI Communication with MPICH2-Nemesis. In 38th International Conference on Parallel Processing (ICPP-2009), Vienna, Austria, Sep. 2009. IEEE Computer Society Press.
  • B. Goglin and S. Moreaud. Dodging Non-Uniform I/O Access in Hierarchical Collective Operations for Multicore Clusters. In CASS 2011: The 1st Workshop on Communication Architecture for Scalable Systems, held in conjunction with IPDPS 2011, Anchorage, AK, May 2011. IEEE Computer Society Press.
  • S. Moreaud. Impacts des effets NUMA sur les communications haute performance dans les grappes de calcul. In 18ème Rencontres Francophones du Parallélisme (RenPar08), Fribourg, Switzerland, Feb. 2008.
  • S. Moreaud. Adaptation des communications MPI intra-nœud aux architectures multicœurs modernes. In 19ème Rencontres Francophones du Parallélisme (RenPar09), Toulouse, France, Sep. 2009.
  • S. Moreaud and B. Goglin. Impact of NUMA Effects on High-Speed Networking with Multi-Opteron Machines. In 19th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2007), Cambridge, Massachusetts, Nov. 2007.
  • S. Moreaud, B. Goglin, D. Goodell, and R. Namyst. Optimizing MPI Communication within large Multicore nodes with Kernel assistance. In CAC 2010: The 10th Workshop on Communication Architecture for Clusters, held in conjunction with IPDPS 2010, Atlanta, GA, Apr. 2010. IEEE Computer Society Press.
  • S. Moreaud, B. Goglin, and R. Namyst. Adaptive MPI Multirail Tuning for Non-Uniform Input/Output Access. In 17th EuroMPI, Stuttgart, Germany, Sep. 2010.

Theses

  • S. Moreaud. Impact des architectures multiprocesseurs sur les communications dans les grappes de calcul : de l'exploration des effets NUMA au placement automatique. Master's thesis, University of Bordeaux, June 2007.
  • S. Moreaud. Mouvement de données et placement des tâches pour les communications haute performance sur machines hiérarchiques. PhD thesis, University of Bordeaux, 2011.