ChatRWKV is similar to ChatGPT, but it is powered by the RWKV (100% RNN) language model and is open source. The hope is to build the "Stable Diffusion" of large language models.
RWKV currently offers a range of models covering different scenarios and languages:
- Raven models: suited to direct chat and the +i command (see the usage example after this list). Available in many language versions; pick the one that matches yours. Good at chatting, completing tasks, and writing code; they can also be tasked with writing manuscripts, outlines, stories, poems, and so on, though the prose is not as strong as the testNovel series models.
- Novel-ChnEng model: a Chinese/English novel model. Use +gen to generate world settings (with a well-written prompt you can steer the plot and the characters that follow); good for sci-fi and fantasy. Not suited to chat or the +i command.
- Novel-Chn model: a pure-Chinese web-novel model; it only supports +gen continuation (it cannot generate world settings, etc.), but it writes better web-novel prose (in a plainer, more accessible style, fitting both male- and female-oriented genres). Not suited to chat or the +i command.
- Novel-ChnEng-ChnPro model: Novel-ChnEng fine-tuned on high-quality works (classics, sci-fi, fantasy, classical literature, translated works, etc.).
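For reference, a minimal sketch of the two modes in v2/chat.py; the +i and +gen commands come from ChatRWKV itself, while the prompts are purely illustrative:

```
+i Write an outline for a fantasy story about a sunken city.
+gen The moon base had been silent for three days when the first signal arrived.
```

+i runs a single-round instruction (best with Raven models); +gen freely continues any prompt (best with the Novel models).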
HuggingFace Gradio Demo (14B ctx8192): https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio
Raven (7B fine-tuned on Alpaca etc.) Demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B
RWKV pip package: https://pypi.org/project/rwkv/
Updates to ChatRWKV v2 and the rwkv pip package (0.7.3):
Use v2/convert_model.py to convert a model for a given strategy, which speeds up loading and saves CPU RAM.
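For example, a hedged sketch of a conversion (the --in/--out/--strategy flags are assumed from the script; verify with python3 v2/convert_model.py --help):

```sh
# Sketch: pre-convert a model for one fixed strategy (flag names assumed; see --help).
python3 v2/convert_model.py \
    --in /path/to/RWKV-4-Pile-14B-xxx.pth \
    --out /path/to/RWKV-4-Pile-14B-xxx-cuda-fp16i8.pth \
    --strategy 'cuda fp16i8'
```

The converted file is tied to that strategy, so load it with the same strategy string.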
### Note: RWKV_CUDA_ON = '1' builds a custom CUDA kernel (run "pip install ninja" first).
### How to build in Linux: set these and run v2/chat.py
```sh
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```
### How to build on Windows:
Install the VS2022 build tools (https://aka.ms/vs/17/release/vs_BuildTools.exe, select Desktop development with C++). Reinstall CUDA 11.7 (install the VC++ extensions). Run v2/chat.py from the "x64 Native Tools Command Prompt".
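On either platform, the kernel is enabled from Python before rwkv is loaded; a minimal sketch using the same environment variables as the snippet further below:

```python
# Sketch: opt in to the custom CUDA kernel before importing rwkv
# (requires ninja plus the CUDA / VC++ toolchain set up as described above).
import os
os.environ["RWKV_JIT_ON"] = '1'
os.environ["RWKV_CUDA_ON"] = '1'  # '1' = build and use the CUDA kernel for seq mode
from rwkv.model import RWKV       # the kernel is compiled on first import/use
```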
Download RWKV-4 weights: https://huggingface.co/BlinkDL (use the RWKV-4 models; do not use the RWKV-4a and RWKV-4b models).
RWKV Discord https://discord.gg/bDSBUMeFpc
Twitter: https://twitter.com/BlinkDL_AI
RWKV LM: https://github.com/BlinkDL/RWKV-LM (explanations of how it works, fine-tuning, training, etc.)
RWKV in 150 lines (Models, Inference, Text Generation): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py
ChatRWKV v2: has "stream" and "split" strategies, plus INT8 quantization. 3G VRAM is enough to run RWKV 14B: https://github.com/BlinkDL/ChatRWKV/tree/main/v2
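As a hedged illustration of the strategy syntax (the strategy strings follow the rwkv package's strategy guide; the model paths are placeholders):

```python
from rwkv.model import RWKV

# "split": first 10 layers on the GPU in fp16, the remaining layers on the CPU in fp32.
m1 = RWKV(model='/path/to/RWKV-4-Pile-14B-xxx', strategy='cuda fp16 *10 -> cpu fp32')

# "stream" with INT8: keep 10 quantized layers resident on the GPU and stream the rest
# through it layer by layer; streaming most layers is what lets 14B fit in ~3G VRAM.
m2 = RWKV(model='/path/to/RWKV-4-Pile-14B-xxx', strategy='cuda fp16i8 *10+')
```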
os.environ["RWKV_JIT_ON"] = '1' os.environ["RWKV_CUDA_ON"] = '0' # if '1' then use CUDA kernel for seq mode (much faster) from rwkv.model import RWKV # pip install rwkv model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-1b5/RWKV-4-Pile-1B5-20220903-8040', strategy='cuda fp16') out, state = model.forward([187, 510, 1563, 310, 247], None) # use 20B_tokenizer.json print(out.detach().cpu().numpy()) # get logits out, state = model.forward([187, 510], None) out, state = model.forward([1563], state) # RNN has state (use deepcopy if you want to clone it) out, state = model.forward([310, 247], state) print(out.detach().cpu().numpy()) # same result as above
RWKV-4-Raven 14B weights: https://huggingface.co/BlinkDL/rwkv-4-raven/blob/main/RWKV-4-Raven-14B-v7-Eng-20230404-ctx4096.pth