What has everyone built with AI to improve the quality of your work, or of your slacking off?

I made a UserScript to improve the chat.deepseek experience. What have the rest of you made?

This post was last edited by hkguy2020 on 2025-3-30 04:36

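I'm writing something that uses the rk3588 rknpu + a LangChain custom wrapper to read a local rkllm model, plus RAG to read a local vector database, and returns the combined query result (better than only querying the llm). Most of it is coded, it runs, and performance is ok.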
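For readers new to the pattern, the flow described here (look up related text in a local vector store, then hand it to the local model together with the question) can be sketched roughly as below. This is only an illustration in plain Python, not the poster's code: `embed`, `retrieve`, `answer`, `fake_llm` and `DOCS` are hypothetical names, the word-count "embedding" stands in for a real embedding model, and `fake_llm` stands in for the rkllm call.

```python
import math
import re
from collections import Counter
from typing import Callable

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (an assumption; real setups use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k stored documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def answer(query: str, docs: list[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieve context first, then send context + question to the model in one prompt."""
    context = retrieve(query, docs, k)
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the local model call."""
    return f"[model reply to a {len(prompt)}-character prompt]"

# Sample documents, made up for this sketch.
DOCS = [
    "The RK3588 SoC includes an NPU for neural network inference.",
    "Flutter is a UI toolkit for building mobile apps.",
    "Stable Diffusion generates images from text prompts.",
]
```

Calling `answer("which chip has an NPU", DOCS, fake_llm)` sends the best-matching documents plus the question to the model in a single prompt, which is the "better than only querying the llm" part.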

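Once that works, I'll have a Flutter app call it, then add AI image generation (using stable diffusion) and send the image back to the Flutter app. I'll play with it myself first; if it matures, I'll see whether it can sell.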
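One plausible way for a Flutter app to call such a service is a small JSON-over-HTTP endpoint that returns the text plus the image as base64. The sketch below uses only the Python standard library; the thread doesn't say how the app and the backend actually talk, so the JSON field names, `generate_text`, `generate_image`, `build_response` and `serve` are all assumptions made for illustration.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate_text(prompt: str) -> str:
    """Hypothetical placeholder for the local LLM + RAG call."""
    return f"echo: {prompt}"

def generate_image(prompt: str) -> bytes:
    """Hypothetical placeholder for the stable diffusion call; returns dummy bytes."""
    return b"not-a-real-image"

def build_response(body: bytes) -> bytes:
    """Turn a JSON request body into a JSON response body."""
    request = json.loads(body or b"{}")
    prompt = str(request.get("prompt", ""))
    payload = {
        "text": generate_text(prompt),
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image_base64": base64.b64encode(generate_image(prompt)).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")

class Handler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        data = build_response(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def serve(port: int = 8000) -> None:
    """Blocks forever; call this on the board that hosts the model."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

The app side would then POST `{"prompt": "..."}` and decode `image_base64` back into bytes before displaying it.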

I'm writing something that uses the rk3588 rknpu + a LangChain custom wrapper to read a local rkllm model + RAG to read a local vector data ...
hkguy2020 posted on 2025/3/30 04:26


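Nice patience, piecing the puzzle together bit by bit. Still, similar full stacks already exist, such as PrivateGPT and RAG with Atlas Vector Search. Unless it delivers some performance edge, doing it only on the RK3588 architecture really doesn't have much special appeal, unless you package it up as a project and sell it as a tutorial.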

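Nice patience, piecing the puzzle together bit by bit. Still, similar full stacks already exist, such as PrivateGPT and RAG with Atlas Vector Search. Unless it delivers some performance edge, doing only ...
s20012797 posted on 2025-3-30 14:00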


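LangChain reading the rk3588 rkllm is really just a wrapper; it can be swapped for other libraries, e.g. Ollama. Ollama can run on a Ras Pi 5, but it won't run fast until the Hailo-H10 is available. Once the LangChain wrapper works, the RAG part is generic. What matters is that the whole AI process can integrate with a Flutter app, because mobile apps now need AI. I'm mainly doing this for fun.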
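The "it's just a wrapper, the RAG part is generic" point can be illustrated with a small interface that any backend satisfies. This is a sketch under assumptions, not the poster's wrapper: `TextBackend`, `RkllmBackend`, `OllamaBackend` and `rag_answer` are hypothetical names, and the rkllm side is only a stub because the thread doesn't show the real call. The Ollama side targets Ollama's `/api/generate` HTTP endpoint on its default port 11434; the model name is whatever you have pulled locally.

```python
import json
import urllib.request
from typing import Protocol

class TextBackend(Protocol):
    """Anything with generate(prompt) -> str can serve as the model."""
    def generate(self, prompt: str) -> str: ...

class RkllmBackend:
    """Stub standing in for the rk3588 rkllm wrapper (real call not shown in the thread)."""
    def generate(self, prompt: str) -> str:
        return f"rkllm-stub: {prompt}"

class OllamaBackend:
    """Calls a locally running Ollama server over HTTP."""
    def __init__(self, model: str, host: str = "http://localhost:11434") -> None:
        self.model = model
        self.host = host

    def request_body(self, prompt: str) -> bytes:
        # stream=False asks for one JSON object instead of a stream of chunks.
        return json.dumps({"model": self.model, "prompt": prompt, "stream": False}).encode("utf-8")

    def generate(self, prompt: str) -> str:
        req = urllib.request.Request(
            self.host + "/api/generate",
            data=self.request_body(prompt),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

def rag_answer(backend: TextBackend, question: str, context: list[str]) -> str:
    """Backend-independent RAG step: build the prompt, then delegate to the backend."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return backend.generate(prompt)
```

Swapping hardware then means passing a different backend object to `rag_answer`; the retrieval and prompt-building code stays untouched.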

LangChain reading the rk3588 rkllm is really just a wrapper; it can be swapped for other libraries, e.g. Ollama. Ollama can run on a Ra ...
hkguy2020 posted on 2025/3/30 15:47


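Honestly, these days around 1000 gets you leftover CPUs and motherboards from two generations back, and they're frighteningly cheap; on Taobao even more so... Going ARM for ARM's sake really feels a bit like shortchanging yourself.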

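This post was last edited by hkguy2020 on 2025-3-30 17:53
Honestly, these days around 1000 gets you leftover CPUs and motherboards from two generations back, and they're frighteningly cheap; on Taobao even more so... Going ARM for ARM's sake really feels ...
s20012797 posted on 2025-3-30 16:14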


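Using ARM isn't too hard to deal with; it's about the same as a Ras Pi. With the Orange Pi 5, the main thing is that it has an npu. With an npu, local llm inference is much faster. Lots of people in the community are talking about it, but many don't know how to set it up. I've got the Orange Pi part sorted, no problems, and it's pretty fun.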

From Gemini: You're right, using an Orange Pi 5 with its NPU for local LLM inference is a cost-effective solution. Developing a custom LangChain wrapper to interact with the librkllmrt.so C++ library requires bridging the gap between Python and C++. ......
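The Gemini reply mentions bridging Python and the librkllmrt.so library. One common way to call into a shared library from Python is `ctypes`. The sketch below deliberately uses the standard C library's `strlen` as a stand-in, because the thread doesn't show any rkllm function signatures; nothing here is the rkllm API, and `c_strlen` is a hypothetical helper name.

```python
import ctypes

# CDLL(None) exposes symbols already loaded into this process, including the
# C library. This works on POSIX systems (Linux, macOS), not on Windows.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
# Without this, ctypes would assume int arguments and an int return value.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(text: str) -> int:
    """Python-facing wrapper: encode to bytes, call into C, return a Python int."""
    return libc.strlen(text.encode("utf-8"))
```

A wrapper for a vendor library would follow the same three steps: load the `.so` by path with `ctypes.CDLL`, declare `argtypes`/`restype` from the vendor's header file, and hide the byte and pointer handling behind a plain Python function.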

Using ARM isn't too hard to deal with; it's about the same as a Ras Pi. With the Orange Pi 5, the main thing is that it has an npu. With an npu, local llm inference ...
hkguy2020 posted on 2025/3/30 17:52


https://www.mouser.hk/new/google ... celerator-dual-tpu/
https://www.mouser.hk/ProductDet ... Q6LZJp2eyeh4w%3D%3D

Key Points
Research suggests NPUs are integrated units for edge devices like smartphones, focusing on low power and real-time AI tasks.
It seems likely that TPUs are standalone processors for data centers, designed for large-scale AI and deep learning with high performance.
The evidence leans toward NPUs being produced by multiple companies, while TPUs are mainly from Google, available via their cloud services.

NPUs and TPUs cater to different segments of AI hardware acceleration. NPUs are integrated, low-power solutions for edge devices, while TPUs are standalone, high-performance processors for data centers. This distinction is crucial for selecting the right accelerator based on application needs, with NPUs for mobile efficiency and TPUs for cloud-scale performance.

My guess is its price-performance would be a bit better than the NPU's.

Key Points
Research suggests NPUs are integrated units for edge devices like smartphones, focus ...
s20012797 posted on 2025-3-30 20:48


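Thanks for the information. The Coral TPU Dual Edge TPU Module and the Hailo-8L are the same class of product. I bought a Hailo-8L last year and found it can't be used for llm, only for imaging. With a lot of products you only find out whether they fit after you've used them. Without having used one, ...  Details: Trying Raspberry Pi 5 + Hailo AI (also trying Orange Pi 5, Python AI programming)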

Gemini:

Understanding the Coral TPU's capabilities in the context of local Large Language Models (LLMs) requires some nuance. Here's a breakdown:

Coral TPU's Strengths:

Designed for Inference:
The Coral TPU, including the Dual Edge TPU module, is primarily optimized for accelerating machine learning inference, especially for tasks like image recognition and object detection.  
It excels at running pre-trained models efficiently on edge devices with low power consumption.
TensorFlow Lite Focus:
Coral TPUs are designed to work seamlessly with TensorFlow Lite, a lightweight version of TensorFlow optimized for mobile and embedded devices.  
This means it's best suited for models that can be converted and optimized for TensorFlow Lite.
Edge Processing:
The Coral TPU's strength lies in performing machine learning tasks locally, reducing latency and enhancing privacy.

Limitations with Large Language Models:

LLM Size and Complexity:
Large Language Models are significantly larger and more complex than the vision models that Coral TPUs typically accelerate.
LLMs often require substantial computational resources, including large amounts of memory and powerful GPUs or TPUs designed for heavy matrix operations.  
TensorFlow Lite Constraints:
While efforts are ongoing to optimize LLMs for edge devices, the current limitations of TensorFlow Lite and the Coral TPU's architecture make it challenging to run very large LLMs efficiently.
LLMs often cannot be quantized to the degree that the Coral TPU requires.
Computational Demands:
LLMs require very large amounts of memory and floating-point computation. The Coral TPU is an integer-based processor.

In summary:

While Coral TPUs are excellent for accelerating specific types of machine learning tasks on edge devices, they are generally not well-suited for running large, complex language models in their entirety.
It might be possible to use Coral TPUs for specific, highly optimized components of an LLM workflow, but full LLM implementations are currently beyond their typical use case.
Therefore, while the Coral TPU Dual Edge TPU module is a powerful tool for edge AI, its application to full, local large language models is very limited by the current state of LLMs and the architecture of the Coral TPU.
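To make the "integer-based processor" and quantization points above concrete, here is a minimal sketch of affine 8-bit quantization in plain Python. It is a generic textbook scheme chosen for illustration, not the Coral toolchain's actual procedure; `quantize` and `dequantize` are hypothetical names.

```python
def quantize(values: list[float], bits: int = 8) -> tuple[list[int], float, int]:
    """Map floats to unsigned integers: q = round(x / scale) + zero_point, clamped to range."""
    qmax = 2 ** bits - 1
    # The representable range is widened to include 0.0 so that zero maps to an integer exactly.
    lo, hi = min(values + [0.0]), max(values + [0.0])
    scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 when every value is zero
    zero_point = round(-lo / scale)
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q: list[int], scale: float, zero_point: int) -> list[float]:
    """Approximate the original floats from the stored integers."""
    return [(x - zero_point) * scale for x in q]

weights = [-1.0, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
# Each restored value is close to the original, off by roughly one step (scale) at most.
errors = [abs(a - b) for a, b in zip(weights, restored)]
```

Every weight ends up as one of only 256 levels. Small vision models usually tolerate that rounding error; the Gemini text above claims that LLMs often do not at the level the Coral TPU requires.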

CORAL TPU DUAL EDGE TPU MODULE vs Orange Pi 5 Max 16gb LPDDR5, in terms of large language model handling power:
Attachment: login required to download or view.
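One rough way to reason about "large language model handling power" is to ask whether the model's weights fit in the device's memory at all. The back-of-envelope sketch below does that arithmetic. The 16 GB figure comes from the post above; the 1.2 overhead factor (runtime buffers, KV cache) is an assumption, and `weight_bytes` and `fits` are hypothetical names. For an accelerator module, pass the capacity from its datasheet.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

def weight_bytes(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone: parameters x bits per weight / 8."""
    return n_params * bits_per_weight / 8

def fits(n_params: float, bits_per_weight: int, memory_bytes: float, overhead: float = 1.2) -> bool:
    """True if the weights plus an assumed overhead factor fit in the given memory."""
    return weight_bytes(n_params, bits_per_weight) * overhead <= memory_bytes

# A 7-billion-parameter model at 4 bits per weight needs 3.5e9 bytes for weights.
seven_b_4bit = weight_bytes(7e9, 4)

# Against 16 GiB of board memory (ignoring what the OS itself uses):
small_fits = fits(7e9, 4, 16 * GIB)
large_fits = fits(70e9, 4, 16 * GIB)
```

Under these assumptions the 7B model fits and the 70B one does not. Memory is only a first filter; it says nothing about tokens per second.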