Lobsters-feed

programmers-only computer news site

Presented as a Cohost Page by the @RSS-feeds bot

(Tags are transplanted with an extra hash sign so they don't spam real Cohost tags or show up in searches, while still letting you follow or muffle topics. I didn't invent doing that.)



Lobste.rs tags: ask, ai
Author: via anehzat

Looking to get advice and additional insights on hosting large language models. I have gathered some preliminary information about several options, including:

Lambda Labs - Good for obtaining GPU virtual machines (VMs) or purchasing bare metal, though it lacks certain features, such as automatic web endpoint setup. What are your thoughts on its reliability and suitability for production workloads?
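For context on what "automatic web endpoint setup" saves you: on a bare GPU VM you typically wrap the model in a small HTTP server yourself. A minimal stdlib-only sketch of that wrapper, with the actual model call stubbed out (`fake_generate`, the route name, and the port are illustrative assumptions, not anything the providers mandate):

```python
# Minimal sketch of the web endpoint you would hand-roll on a bare GPU VM
# when the provider doesn't set one up for you. The model call is stubbed
# (fake_generate); in practice you'd load your LLM once at startup and
# call its generate function here.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_generate(prompt: str) -> str:
    # Placeholder for a real model.generate() call.
    return prompt + " ... [model output]"


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"prompt": "..."} and return a completion.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = {"completion": fake_generate(body.get("prompt", ""))}
        payload = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run_server(port: int = 8000):
    # Blocks forever; call this on the VM behind your reverse proxy.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Managed platforms generate roughly this layer (plus TLS, auth, and autoscaling) for you; on raw VMs it's your problem.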

Vast AI - Similar to Lambda Labs, but without direct control over the underlying hardware. How do you perceive its stability compared to other providers, especially for production use cases?

Runpod - This provider offers competitive pricing for cloud GPUs, though usage is billed by the hour rather than by the second. Setting up instances can be slightly more challenging; has anyone had experience with this platform? If so, what was your overall impression?
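The billing-granularity point matters mostly for short jobs: with hourly billing, a 90-second run is rounded up to a full hour. A toy comparison with a made-up rate (the $/hour figure is illustrative, not any provider's actual price):

```python
# Illustrative cost comparison of hourly vs per-second GPU billing.
# The rate below is a hypothetical number, not a real provider's price.
import math


def cost_hourly(run_seconds: float, rate_per_hour: float) -> float:
    # Hourly billing: partial hours round up to a full hour.
    return math.ceil(run_seconds / 3600) * rate_per_hour


def cost_per_second(run_seconds: float, rate_per_hour: float) -> float:
    # Per-second billing: pay exactly for the time used.
    return run_seconds / 3600 * rate_per_hour


rate = 2.00  # hypothetical $/hour for a GPU instance
print(cost_hourly(90, rate))                 # a 90 s job is billed as a full hour
print(round(cost_per_second(90, rate), 3))   # the same job at per-second granularity
```

For long-running inference servers the two converge; it's bursty or experimental workloads where the granularity changes the bill.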

Replicate - While it offers an extensive range of GPU types, users report slow cold starts, which can hurt usability for custom models deployed on its serverless infrastructure. Have there been any improvements on this front since launch?
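On why cold starts bite serverless LLM hosting in particular: most of the delay is loading multi-gigabyte weights when an idle instance handles its first request; warm instances skip that. A small sketch of the pattern, with the load simulated by a sleep (the timings and the stub model are assumptions for illustration):

```python
# Sketch of the cold-start pattern in serverless model hosting.
# load_model() simulates pulling multi-GB weights with a short sleep;
# caching it means only the first request on an instance pays the cost.
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model():
    # Stand-in for downloading and loading model weights (the slow part).
    time.sleep(0.5)
    return lambda prompt: prompt + " [output]"


def handle_request(prompt: str) -> str:
    model = load_model()  # slow only on the first call per warm instance
    return model(prompt)


t0 = time.time(); handle_request("a"); cold = time.time() - t0
t0 = time.time(); handle_request("b"); warm = time.time() - t0
print(f"cold={cold:.2f}s warm={warm:.4f}s")
```

Providers differentiate on exactly this: how aggressively they cache weights, keep instances warm, or snapshot loaded processes.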

Beam.cloud - Boasts one of the better developer experiences along with quicker cold starts, but doesn't offer as many GPU configurations as alternatives such as Runpod or Vast AI. Would you recommend this service given its strengths and limitations?

Thank you in advance for sharing your expertise & experience!

