Research Intern – Multimodal Foundation Model for Vision

Sony UK Technology Centre • Remote, Washington D.C.
Posted 30 days ago
Job type
  • Part-time
  • Remote
Job description

Sony AI is seeking research interns to join us. Our team works on fundamental and applied research, with a focus on building next-generation foundation models for vision in a responsible manner. The role of a research intern is to develop efficient and effective methodologies and prototype solutions. You will work with a productive team of world-class scientists and engineers to tackle the most challenging problems in foundation models and generative AI, including low-cost yet powerful vision foundation models (VFM), vision-language models (VLM), unified models, automatic model compression, optimization, and deployment on cloud and edge. You will see your ideas not only published in papers, but also improving the experience of billions of customers.

Roles and Responsibilities

  • Conduct fundamental and innovative development in low-cost yet powerful vision-language models (VLM), unified models, automatic model compression, optimization, and deployment on cloud and edge.

  • Design or implement state-of-the-art techniques for model compression, inference speedup, deployment on hardware, and tool automation.

  • Build proofs of concept (PoC) for various vision+text and generation-related tasks (VQA, captioning, understanding, etc.) and hardware platforms.

  • Contribute to library and tool development to support business, or publish influential research in top-tier conferences and journals.

Required Qualifications and Skills

  • Currently has, or is in the process of obtaining, a master's or PhD degree in computer science or a related field.

  • Be very self-motivated and capable of proposing and implementing innovative ideas.

  • Solid presentation and communication skills for internal and external audiences.

  • Publications or expertise in compact foundation model development and deployment. Influential open-source projects or paper publications at top conferences, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ACL, etc.

  • Front-end development experience is a plus.

  • Solid coding skills in Python, PyTorch, etc.

Working Location

Location flexible (Tokyo, Europe, US)

The target hourly rate for this internship is $50.00 per hour. The individual will be paid hourly and is eligible for overtime.

All qualified applicants will receive consideration for employment without regard to any basis protected by applicable federal, state, or local law, ordinance, or regulation.

Disability Accommodation for Applicants to Sony Corporation of America

Sony Corporation of America provides reasonable accommodation for qualified individuals with disabilities and disabled veterans in job application procedures. For reasonable accommodation requests, please contact us by email at or by mail to: Sony Corporation of America, Human Resources Department, 25 Madison Avenue, New York, NY 10010. Please indicate the position you are applying for.

Right to Work (English/Spanish)

E-Verify Participation (English/Spanish)
