Shamar@feddit.it • 11 days ago (edited) • A community statement supporting the Open Source Definition (OSD) • external link • 0 comments • 6 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • How ‘Embeddings’ Encode What Words Mean • external link • 0 comments • 7 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • New AI model “learns” how to simulate Super Mario Bros. from video footage • external link • 0 comments • 3 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o) • external link • 0 comments • 8 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • It’s Not Intelligent If It Always Halts: A Critical Perspective on Current Approaches to AGI • external link • 2 comments • 8 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • The Difference Between Speaking and Thinking • external link • 0 comments • 4 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 2 months ago • Diffusion Models Are Real-Time Game Engines • external link • 0 comments • 4 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 3 months ago • Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduces memory usage by 60%. • external link • 0 comments • 2 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 3 months ago • Transformer Explainer • external link • 0 comments • 2 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 3 months ago • Alibaba claims no. 1 spot in AI math models with Qwen2-Math • external link • 0 comments • 9 upvotes
yboutros@infosec.pub • 3 months ago • How to convert a positionally encoded predicted embedding from a decoder to its matching token? • text post • 2 comments • 2 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 3 months ago • New Open-Source AI Image Generator Beats Midjourney, SD3 and Auraflow • external link • 0 comments • 5 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 4 months ago • AI models collapse when trained on recursively generated data • external link • 3 comments • 15 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 4 months ago • RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing • external link • 0 comments • 4 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 4 months ago • Alibaba's Qwen LLM model leading open source rankings • external link • 0 comments • 4 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 5 months ago (edited) • By using the same techniques Google used to solve Go (MTCS and backprop), Llama8B gets 96.7% on math benchmark GSM8K. That’s better than GPT-4, Claude and Gemini, with 200x fewer parameters! • external link • 0 comments • 7 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 5 months ago • Mixture of Agents (MoA) leverages several open-source LLM agents to achieve a score of 65.1% on AlpacaEval 2.0 • external link • 0 comments • 4 upvotes
ylai@lemmy.ml • 5 months ago • From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate • external link • 0 comments • 1 upvote
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 6 months ago • Sakuga-42M Dataset: Scaling Up Cartoon Research • external link • 0 comments • 3 upvotes
☆ Yσɠƚԋσʂ ☆@lemmy.ml • 6 months ago • How AI 'Understands' Images (CLIP) • external link • 0 comments • 4 upvotes