Research Intern - LLM Quantization

Microsoft
United States, Washington, Redmond
Feb 20, 2025
Overview

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

Our team works on performance analysis and optimization of large language models, spanning the stack from GPU kernel implementation through to changes in model architecture. A key challenge is that quantization of models to use smaller data types is only effective if we can dequantize the formats and use them efficiently during computation. In this Research Internship we are going to tackle this problem by exploring the co-design of quantization techniques (e.g., fewer bits per weight) and kernel design for efficient decode (e.g., expanding weights to the 4-bit, 6-bit and 8-bit floating point formats in modern GPUs).
Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world's best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.