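Writing test code and evaluation metrics
The current dataset only covers tool classification and surgical phase classification, so the evaluation metrics are still the usual generic ones. However: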
predicted_labels:['The current surgical phase is CalotTriangleDissection, Grasper, Hook tool exists.', 'The current surgical phase is GallbladderDissection, Grasper, Hook tool exists.', 'The current surgical phase is GallbladderDissection, Hook tool exists.', 'The current surgical phase is Endostomy, Grasper tool exists.']
true_labels:['The current surgical phase is CalotTriangleDissection, Grasper, Hook tool exists.', 'The current surgical phase is GallbladderDissection, Grasper, Hook tool exists.', 'The current surgical phase is GallbladderDissection, Hook tool exists.', 'The current surgical phase is CleaningCoagulation, Bipolar, Grasper tool exists.', 'The current surgical phase is Preparation, no tool exists.', 'The current surgical phase is CalotTriangleDissection, Grasper, Hook tool exists.']
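A minimal sketch of how the sentences above could be turned into metrics, assuming every label follows the template "The current surgical phase is <Phase>, <Tool>, ... tool exists." seen in the samples. The names parse_label and evaluate are hypothetical and not taken from the test script. The sketch refuses to score lists of different length (the dump above has 4 predictions against 6 labels), because pairing them by position would silently misalign samples.

```python
import re

# Assumed sentence template, inferred from the sample labels above.
LABEL_PATTERN = re.compile(r"The current surgical phase is (\w+), (.*) tool exists\.")


def parse_label(sentence):
    """Return (phase, frozenset of tools); phase is None if the sentence does not match."""
    match = LABEL_PATTERN.match(sentence.strip())
    if match is None:
        return None, frozenset()
    phase, tool_part = match.group(1), match.group(2).strip()
    if tool_part == "no":  # the "no tool exists." case
        return phase, frozenset()
    return phase, frozenset(t.strip() for t in tool_part.split(",") if t.strip())


def evaluate(predicted_labels, true_labels):
    """Phase accuracy plus micro-averaged tool precision, recall and F1."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError(
            f"length mismatch: {len(predicted_labels)} predictions "
            f"vs {len(true_labels)} labels"
        )
    if not true_labels:
        raise ValueError("no samples to evaluate")
    phase_hits = tp = fp = fn = 0
    for pred, true in zip(predicted_labels, true_labels):
        pred_phase, pred_tools = parse_label(pred)
        true_phase, true_tools = parse_label(true)
        phase_hits += int(pred_phase is not None and pred_phase == true_phase)
        tp += len(pred_tools & true_tools)   # tools predicted and present
        fp += len(pred_tools - true_tools)   # tools predicted but absent
        fn += len(true_tools - pred_tools)   # tools present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "phase_accuracy": phase_hits / len(true_labels),
        "tool_precision": precision,
        "tool_recall": recall,
        "tool_f1": f1,
    }
```

Micro-averaging over tools is only one possible choice; per-tool or per-phase scores may be more informative if the classes are imbalanced.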
Model loading problem
Traceback (most recent call last):
File "/mnt/share/toky/Projects/SmolVLM2_Test/Smol_VLM_Video_Test.py", line 14, in <module>
processor = AutoProcessor.from_pretrained(model_path)
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/transformers/models/auto/processing_auto.py", line 375, in from_pretrained
raise ValueError(
ValueError: Unrecognized processing class in /mnt/share/toky/LLMs/Toky-Generate/SmolVLM2-2.2B-Instruct-video-feedback-Cholec80/. Can't instantiate a processor, a tokenizer, an image processor or a feature extractor for this model. Make sure the repository contains the files of at least one of those processing classes.
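When saving locally, remember to save both the model and the processor:
# Save the model and the processor to the specified local directory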
trainer.save_model(save_directory)
processor.save_pretrained(save_directory)
Generating data, round two
Hooking up Kimi:
/mnt/share/toky/CondaEnvs/GenSurgery/bin/python /mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py
{'video_found': True, 'audio_found': False, 'metadata': {'major_brand': 'mp42', 'minor_version': '0', 'compatible_brands': 'mp42isomavc1', 'creation_time': '2017-01-13T18:33:13.000000Z', 'title': '8307HD Bin.01-H264-480', 'encoder': 'HandBrake rev0 2013051899'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [854, 480], 'bitrate': 2606, 'fps': 25.0, 'codec_name': 'h264', 'profile': '(Main)', 'metadata': {'Metadata': '', 'creation_time': '2017-01-13T18:33:13.000000Z', 'vendor_id': '[0][0][0][0]', 'encoder': 'JVT/AVC Coding'}}], 'input_number': 0}], 'duration': 2129.04, 'bitrate': 2609, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'h264', 'video_profile': '(Main)', 'video_size': [854, 480], 'video_bitrate': 2606, 'video_fps': 25.0, 'video_duration': 2129.04, 'video_n_frames': 53226}
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
{'video_found': True, 'audio_found': False, 'metadata': {'major_brand': 'mp42', 'minor_version': '0', 'compatible_brands': 'mp42isomavc1', 'creation_time': '2017-01-13T17:02:50.000000Z', 'title': '7301HD Bin.01-H264-480', 'encoder': 'HandBrake rev0 2013051899'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [854, 480], 'bitrate': 2902, 'fps': 25.0, 'codec_name': 'h264', 'profile': '(Main)', 'metadata': {'Metadata': '', 'creation_time': '2017-01-13T17:02:50.000000Z', 'vendor_id': '[0][0][0][0]', 'encoder': 'JVT/AVC Coding'}}], 'input_number': 0}], 'duration': 1522.04, 'bitrate': 2905, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'h264', 'video_profile': '(Main)', 'video_size': [854, 480], 'video_bitrate': 2902, 'video_fps': 25.0, 'video_duration': 1522.04, 'video_n_frames': 38051}
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -i /mnt/share/toky/Datasets/Cholec80/videos/video04.mp4 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 1679.520000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 1679.645000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
<phase>GallbladderPackaging</phase>. Tools: <tool>Grasper, SpecimenBag</tool>. Next step: Carefully grasp the gallbladder and insert it into the specimen bag. Ensure the bag is fully opened and no tissue is torn. Risks: Bleeding from the liver bed or cystic duct stump. Maintain hemostasis and monitor for any signs of bleeding.
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 994.520000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 994.645000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
<phase>Clipping</phase>, <tool>Grasper</tool>. Next, secure clip on cystic artery. Risk: Bleeding. Ensure clip is properly placed to avoid slippage.
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 683.520000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 683.645000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
Next step: Secure cystic artery and duct. Use grasper to expose, hook for dissection. Be cautious of bleeding from Calot triangle vessels. Risk: Hemorrhage. Ensure clear visualization and gentle tissue handling to prevent bile duct injury.
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 1801.520000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -ss 1801.645000 -i /mnt/share/toky/Datasets/Cholec80/videos/video25.mp4 -ss 1.000000 -loglevel error -f image2pipe -vf scale=854:480 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
Traceback (most recent call last):
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 315, in <module>
process_samples()
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 114, in process_samples
process_mode('train', train_samples)
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 144, in process_mode
data = build_metadata(
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 274, in build_metadata
text_prompt = call_LLM_generate_text_prompt(phase_and_tool_info,frame_save_dir)
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 269, in call_LLM_generate_text_prompt
result = kimi_AI(phase_and_tool_info,frame_save_dir)
File "/mnt/share/toky/Projects/GenSurgery/toky_process_dataset/process_cholec80.py", line 259, in kimi_AI
response = client.chat.completions.create(
File "/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/openai/_utils/_utils.py", line 287, in wrapper
return func(*args, **kwargs)
File "/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/openai/resources/chat/completions/completions.py", line 925, in create
return self._post(
File "/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/openai/_base_client.py", line 1239, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
File "/mnt/share/toky/CondaEnvs/GenSurgery/lib/python3.10/site-packages/openai/_base_client.py", line 1034, in request
raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'Your account org-375e9bdee55043cabb13ba9fd1e99161<ak-f194u38t148111bw43ti> request reached organization max RPM: 3, please try again after 1 seconds', 'type': 'rate_limit_reached_error'}}
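Explanation:
The error openai.RateLimitError with status code 429 means the request hit the organization-level rate limit (RPM, requests per minute). The exception class comes from the openai client library, but the failing call is kimi_AI, so the limit is enforced on the Kimi account rather than by OpenAI. The message states that organization org-375e9bdee55043cabb13ba9fd1e99161 has reached its maximum of 3 RPM and that the request should be retried after 1 second.
Fix: add a timer around the call to throttle the request rate.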
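One way such a timer could look is sketched below, assuming the limit of 3 RPM from the error message. MinIntervalThrottle and call_with_throttle are hypothetical names, not part of process_cholec80.py or of the openai package. The throttle simply spaces consecutive calls at least 60 / max_rpm seconds apart; the clock and sleep functions are injectable so the behaviour can be checked without actually waiting.

```python
import time


class MinIntervalThrottle:
    """Keeps consecutive calls at least 60 / max_rpm seconds apart."""

    def __init__(self, max_rpm, clock=time.monotonic, sleep=time.sleep):
        if max_rpm <= 0:
            raise ValueError("max_rpm must be positive")
        self.min_interval = 60.0 / max_rpm
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, None before the first

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def call_with_throttle(throttle, func, *args, **kwargs):
    """Wait for the throttle, then forward the call to func."""
    throttle.wait()
    return func(*args, **kwargs)


# Usage sketch: kimi_AI is the function named in the traceback above.
# throttle = MinIntervalThrottle(max_rpm=3)
# result = call_with_throttle(throttle, kimi_AI, phase_and_tool_info, frame_save_dir)
```

With max_rpm=3 this waits up to 20 seconds between requests. Since the server counts requests on its own clock, a slightly lower max_rpm or a retry on 429 may still be needed in practice.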
Doubao output
[
{
"id": "t25_0000000",
"images": [
"t25_0000000_06.jpg",
"t25_0000000_01.jpg",
"t25_0000000_21.jpg",
"t25_0000000_18.jpg",
"t25_0000000_14.jpg",
"t25_0000000_05.jpg",
"t25_0000000_00.jpg",
"t25_0000000_07.jpg",
"t25_0000000_15.jpg",
"t25_0000000_03.jpg",
"t25_0000000_12.jpg",
"t25_0000000_23.jpg",
"t25_0000000_19.jpg",
"t25_0000000_08.jpg",
"t25_0000000_16.jpg",
"t25_0000000_17.jpg",
"t25_0000000_13.jpg",
"t25_0000000_10.jpg",
"t25_0000000_20.jpg",
"t25_0000000_11.jpg",
"t25_0000000_02.jpg",
"t25_0000000_04.jpg",
"t25_0000000_22.jpg",
"t25_0000000_09.jpg"
],
"text prompt": "<phase>CalotTriangleDissection</phase>\n<tool>Hook, Grasper</tool>\nNext step: Continue dissecting Calot's Triangle to expose cystic duct and artery.\n<risk> Bleeding from nearby blood vessels. </risk>\n<risk level> 2 </risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/test/videos_annotated/t25_0000000.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 2,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Hook, Grasper</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>CalotTriangleDissection</phase>\n<tool>Hook, Grasper</tool>\nNext step: Continue dissecting Calot's Triangle to expose cystic duct and artery.\n<risk> Bleeding from nearby blood vessels. </risk>\n<risk level> 2 </risk level>"
}
]
},
{
"id": "t04_0000001",
"images": [
"t04_0000001_16.jpg",
"t04_0000001_11.jpg",
"t04_0000001_00.jpg",
"t04_0000001_10.jpg",
"t04_0000001_04.jpg",
"t04_0000001_05.jpg",
"t04_0000001_22.jpg",
"t04_0000001_01.jpg",
"t04_0000001_15.jpg",
"t04_0000001_08.jpg",
"t04_0000001_20.jpg",
"t04_0000001_06.jpg",
"t04_0000001_09.jpg",
"t04_0000001_13.jpg",
"t04_0000001_14.jpg",
"t04_0000001_03.jpg",
"t04_0000001_19.jpg",
"t04_0000001_23.jpg",
"t04_0000001_18.jpg",
"t04_0000001_21.jpg",
"t04_0000001_07.jpg",
"t04_0000001_17.jpg",
"t04_0000001_02.jpg",
"t04_0000001_12.jpg"
],
"text prompt": "<phase>CalotTriangleDissection</phase>\n<tool>Hook, Grasper</tool>\nNext step: Continue dissecting Calot's triangle to identify cystic duct and artery.\n<risk> Bleeding from vascular structures in Calot's triangle. </risk>\n<risk level> 2 </risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/test/videos_annotated/t04_0000001.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 2,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Hook, Grasper</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>CalotTriangleDissection</phase>\n<tool>Hook, Grasper</tool>\nNext step: Continue dissecting Calot's triangle to identify cystic duct and artery.\n<risk> Bleeding from vascular structures in Calot's triangle. </risk>\n<risk level> 2 </risk level>"
}
]
},
{
"id": "t25_0000002",
"images": [
"t25_0000002_08.jpg",
"t25_0000002_15.jpg",
"t25_0000002_02.jpg",
"t25_0000002_16.jpg",
"t25_0000002_21.jpg",
"t25_0000002_12.jpg",
"t25_0000002_14.jpg",
"t25_0000002_19.jpg",
"t25_0000002_22.jpg",
"t25_0000002_04.jpg",
"t25_0000002_20.jpg",
"t25_0000002_06.jpg",
"t25_0000002_01.jpg",
"t25_0000002_10.jpg",
"t25_0000002_17.jpg",
"t25_0000002_18.jpg",
"t25_0000002_00.jpg",
"t25_0000002_13.jpg",
"t25_0000002_03.jpg",
"t25_0000002_07.jpg",
"t25_0000002_23.jpg",
"t25_0000002_05.jpg",
"t25_0000002_09.jpg",
"t25_0000002_11.jpg"
],
"text prompt": "<phase>GallbladderDissection</phase>\n<tool>Grasper</tool>\nNext step: Continue dissecting the gallbladder. Navigate carefully around adjacent structures.\n<risk> Bleeding from nearby blood vessels. </risk>\n<risk level> 2 </risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/test/videos_annotated/t25_0000002.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 2,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>GallbladderDissection</phase> and the used tools <tool>Grasper</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>GallbladderDissection</phase>\n<tool>Grasper</tool>\nNext step: Continue dissecting the gallbladder. Navigate carefully around adjacent structures.\n<risk> Bleeding from nearby blood vessels. </risk>\n<risk level> 2 </risk level>"
}
]
}
]
Sample data generated by Kimi
[
{
"id": "t25_0000000",
"images": [
"t25_0000000_06.jpg",
"t25_0000000_01.jpg",
"t25_0000000_21.jpg",
"t25_0000000_18.jpg",
"t25_0000000_14.jpg",
"t25_0000000_05.jpg",
"t25_0000000_00.jpg",
"t25_0000000_07.jpg",
"t25_0000000_15.jpg",
"t25_0000000_03.jpg",
"t25_0000000_12.jpg",
"t25_0000000_23.jpg",
"t25_0000000_19.jpg",
"t25_0000000_08.jpg",
"t25_0000000_16.jpg",
"t25_0000000_17.jpg",
"t25_0000000_13.jpg",
"t25_0000000_10.jpg",
"t25_0000000_20.jpg",
"t25_0000000_11.jpg",
"t25_0000000_02.jpg",
"t25_0000000_04.jpg",
"t25_0000000_22.jpg",
"t25_0000000_09.jpg"
],
"text prompt": "<phase>CalotTriangleDissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000000.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Hook</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>CalotTriangleDissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>"
}
]
},
{
"id": "t25_0000001",
"images": [
"t25_0000001_13.jpg",
"t25_0000001_15.jpg",
"t25_0000001_12.jpg",
"t25_0000001_06.jpg",
"t25_0000001_21.jpg",
"t25_0000001_10.jpg",
"t25_0000001_20.jpg",
"t25_0000001_02.jpg",
"t25_0000001_19.jpg",
"t25_0000001_04.jpg",
"t25_0000001_07.jpg",
"t25_0000001_09.jpg",
"t25_0000001_16.jpg",
"t25_0000001_17.jpg",
"t25_0000001_23.jpg",
"t25_0000001_18.jpg",
"t25_0000001_14.jpg",
"t25_0000001_11.jpg",
"t25_0000001_00.jpg",
"t25_0000001_03.jpg",
"t25_0000001_22.jpg",
"t25_0000001_01.jpg",
"t25_0000001_08.jpg",
"t25_0000001_05.jpg"
],
"text prompt": "<phase>GallbladderDissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully retract the gallbladder to expose the cystic duct and artery for ligation. <risk>Bleeding</risk> is a potential risk, especially during dissection near the liver bed. <risk level>2</risk level> indicates moderate risk due to the proximity to major blood vessels.",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000001.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 2,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>GallbladderDissection</phase> and the used tools <tool>Hook</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>GallbladderDissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully retract the gallbladder to expose the cystic duct and artery for ligation. <risk>Bleeding</risk> is a potential risk, especially during dissection near the liver bed. <risk level>2</risk level> indicates moderate risk due to the proximity to major blood vessels."
}
]
},
{
"id": "t04_0000002",
"images": [
"t04_0000002_02.jpg",
"t04_0000002_15.jpg",
"t04_0000002_12.jpg",
"t04_0000002_21.jpg",
"t04_0000002_10.jpg",
"t04_0000002_08.jpg",
"t04_0000002_14.jpg",
"t04_0000002_03.jpg",
"t04_0000002_09.jpg",
"t04_0000002_18.jpg",
"t04_0000002_13.jpg",
"t04_0000002_19.jpg",
"t04_0000002_01.jpg",
"t04_0000002_05.jpg",
"t04_0000002_06.jpg",
"t04_0000002_07.jpg",
"t04_0000002_04.jpg",
"t04_0000002_22.jpg",
"t04_0000002_00.jpg",
"t04_0000002_17.jpg",
"t04_0000002_16.jpg",
"t04_0000002_23.jpg",
"t04_0000002_11.jpg",
"t04_0000002_20.jpg"
],
"text prompt": "<phase>Calot Triangle Dissection</phase> is currently underway using a <tool>Grasper</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t04_0000002.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Grasper</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>Calot Triangle Dissection</phase> is currently underway using a <tool>Grasper</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>"
}
]
},
{
"id": "t25_0000003",
"images": [
"t25_0000003_07.jpg",
"t25_0000003_14.jpg",
"t25_0000003_20.jpg",
"t25_0000003_18.jpg",
"t25_0000003_03.jpg",
"t25_0000003_05.jpg",
"t25_0000003_12.jpg",
"t25_0000003_10.jpg",
"t25_0000003_06.jpg",
"t25_0000003_17.jpg",
"t25_0000003_01.jpg",
"t25_0000003_23.jpg",
"t25_0000003_15.jpg",
"t25_0000003_19.jpg",
"t25_0000003_09.jpg",
"t25_0000003_11.jpg",
"t25_0000003_13.jpg",
"t25_0000003_08.jpg",
"t25_0000003_16.jpg",
"t25_0000003_04.jpg",
"t25_0000003_21.jpg",
"t25_0000003_00.jpg",
"t25_0000003_02.jpg",
"t25_0000003_22.jpg"
],
"text prompt": "<phase>GallbladderPackaging</phase>. <tool>Grasper, SpecimenBag</tool>. Next, ensure the gallbladder is fully contained within the bag. Carefully exteriorize the specimen bag through the incision. <risk>Risks include bleeding from the liver bed or cystic duct stump. Monitor hemostasis closely.</risk> <risk level>1</risk level>.",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000003.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>GallbladderPackaging</phase> and the used tools <tool>Grasper, SpecimenBag</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>GallbladderPackaging</phase>. <tool>Grasper, SpecimenBag</tool>. Next, ensure the gallbladder is fully contained within the bag. Carefully exteriorize the specimen bag through the incision. <risk>Risks include bleeding from the liver bed or cystic duct stump. Monitor hemostasis closely.</risk> <risk level>1</risk level>."
}
]
},
{
"id": "t25_0000004",
"images": [
"t25_0000004_01.jpg",
"t25_0000004_16.jpg",
"t25_0000004_09.jpg",
"t25_0000004_11.jpg",
"t25_0000004_20.jpg",
"t25_0000004_14.jpg",
"t25_0000004_22.jpg",
"t25_0000004_07.jpg",
"t25_0000004_02.jpg",
"t25_0000004_08.jpg",
"t25_0000004_05.jpg",
"t25_0000004_21.jpg",
"t25_0000004_12.jpg",
"t25_0000004_17.jpg",
"t25_0000004_06.jpg",
"t25_0000004_03.jpg",
"t25_0000004_10.jpg",
"t25_0000004_04.jpg",
"t25_0000004_18.jpg",
"t25_0000004_15.jpg",
"t25_0000004_19.jpg",
"t25_0000004_00.jpg",
"t25_0000004_13.jpg",
"t25_0000004_23.jpg"
],
"text prompt": "<phase>Calot Triangle Dissection</phase> is currently underway using <tool>Grasper and Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000004.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Grasper, Hook</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>Calot Triangle Dissection</phase> is currently underway using <tool>Grasper and Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>"
}
]
},
{
"id": "t25_0000005",
"images": [
"t25_0000005_02.jpg",
"t25_0000005_13.jpg",
"t25_0000005_14.jpg",
"t25_0000005_01.jpg",
"t25_0000005_04.jpg",
"t25_0000005_20.jpg",
"t25_0000005_12.jpg",
"t25_0000005_23.jpg",
"t25_0000005_18.jpg",
"t25_0000005_03.jpg",
"t25_0000005_16.jpg",
"t25_0000005_07.jpg",
"t25_0000005_21.jpg",
"t25_0000005_19.jpg",
"t25_0000005_17.jpg",
"t25_0000005_00.jpg",
"t25_0000005_05.jpg",
"t25_0000005_15.jpg",
"t25_0000005_09.jpg",
"t25_0000005_22.jpg",
"t25_0000005_06.jpg",
"t25_0000005_08.jpg",
"t25_0000005_11.jpg",
"t25_0000005_10.jpg"
],
"text prompt": "<phase>Calot Triangle Dissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000005.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>CalotTriangleDissection</phase> and the used tools <tool>Hook</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>Calot Triangle Dissection</phase> is currently underway using a <tool>Hook</tool>. Next, carefully identify and clip the cystic artery and cystic duct. <risk>Risks include bleeding from the cystic artery or duct.</risk> <risk level>1</risk level>"
}
]
},
{
"id": "t25_0000006",
"images": [
"t25_0000006_17.jpg",
"t25_0000006_16.jpg",
"t25_0000006_05.jpg",
"t25_0000006_08.jpg",
"t25_0000006_07.jpg",
"t25_0000006_11.jpg",
"t25_0000006_02.jpg",
"t25_0000006_06.jpg",
"t25_0000006_20.jpg",
"t25_0000006_04.jpg",
"t25_0000006_18.jpg",
"t25_0000006_15.jpg",
"t25_0000006_13.jpg",
"t25_0000006_03.jpg",
"t25_0000006_21.jpg",
"t25_0000006_19.jpg",
"t25_0000006_09.jpg",
"t25_0000006_22.jpg",
"t25_0000006_14.jpg",
"t25_0000006_10.jpg",
"t25_0000006_12.jpg",
"t25_0000006_01.jpg",
"t25_0000006_00.jpg",
"t25_0000006_23.jpg"
],
"text prompt": "<phase>GallbladderDissection</phase>. <tool>Grasper, Hook</tool> are being used. Next, carefully retract the gallbladder and continue dissection to free it from liver bed. <risk>Bleeding from liver bed or cystic artery</risk> <risk level>1</risk level>.",
"video link": "/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/train/videos_annotated/t25_0000006.mp4",
"visual quality": 4,
"temporal consistency": 4,
"dynamic degree": 4,
"text-to-video alignment": 4,
"factual consistency": 4,
"risk level": 1,
"conversations": [
{
"from": "human",
"value": "You're an experienced gallbladder surgery expert. Based on the provided surgical video frames:\n First, explicitly repeat the current surgical phase <phase>GallbladderDissection</phase> and the used tools <tool>Grasper, Hook</tool>.\n Next, concisely present the next-step surgical prompt and navigation information.\n Additionally, highlight potential risks such as bleeding. Enclose the description of potential risks within <risk> </risk> tags. Moreover, infer the current risk level from the images. Represent the risk level as <risk level> [0 - 4] </risk level>, where 0 indicates no risk and 4 represents the highest risk level.\n Your entire response must not exceed 100 words and should include <phase></phase>, <tool> </tool>, and <risk> </risk> <risk level> </risk level> tags.\n "
},
{
"from": "gpt",
"value": "<phase>GallbladderDissection</phase>. <tool>Grasper, Hook</tool> are being used. Next, carefully retract the gallbladder and continue dissection to free it from liver bed. <risk>Bleeding from liver bed or cystic artery</risk> <risk level>1</risk level>."
}
]
}
]
Overall, Kimi's outputs at least vary in risk level. In terms of cost, Kimi is probably somewhat cheaper, and it also came with 15 yuan of free credit.
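To compare the two generators field by field, the tags can be pulled out of each response. The sketch below is an assumption about how that could be done, not part of the existing pipeline; parse_response is a hypothetical name. It also normalizes two quirks visible in the Kimi samples above: phase names written with spaces (Calot Triangle Dissection) and tool lists joined with "and" (Grasper and Hook).

```python
import re

# One pattern per tag used in the prompts above.
TAG_PATTERNS = {
    "phase": re.compile(r"<phase>(.*?)</phase>", re.S),
    "tool": re.compile(r"<tool>(.*?)</tool>", re.S),
    "risk": re.compile(r"<risk>(.*?)</risk>", re.S),
    "risk_level": re.compile(r"<risk level>\s*(\d)\s*</risk level>"),
}


def parse_response(text):
    """Extract phase, tools, risk text and risk level from one response."""
    out = {}
    for key, pattern in TAG_PATTERNS.items():
        match = pattern.search(text)
        out[key] = match.group(1).strip() if match else None
    if out["phase"] is not None:
        # "Calot Triangle Dissection" -> "CalotTriangleDissection"
        out["phase"] = out["phase"].replace(" ", "")
    if out["tool"] is not None:
        # Accept both "Hook, Grasper" and "Grasper and Hook".
        parts = re.split(r",|\band\b", out["tool"])
        out["tool"] = sorted(p.strip() for p in parts if p.strip())
    if out["risk_level"] is not None:
        out["risk_level"] = int(out["risk_level"])
    return out
```

Missing tags come back as None, so a response that ignores the required format can be counted rather than crashing the evaluation.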
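Re-testing dataset loading
Reloading picked up a stale cache and still used the text prompt of the previous dataset; the dataset only reloaded correctly after changing the code.
Because a new risk level value was added, the schema no longer matches and loading fails with an error.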
Generating train split: 0%| | 0/32901 [00:00<?, ? examples/s]Failed to read file '/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback/annotated/train-00000-of-00001.parquet' with error <class 'datasets.table.CastError'>: Couldn't cast
id: string
images: list<element: string>
child 0, element: string
text prompt: string
video link: string
visual quality: int64
temporal consistency: int64
dynamic degree: int64
text-to-video alignment: int64
factual consistency: int64
risk level: int64
conversations: list<element: struct<from: string, value: string>>
child 0, element: struct<from: string, value: string>
child 0, from: string
child 1, value: string
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 1637
to
{'id': Value(dtype='string', id=None), 'images': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None), 'text prompt': Value(dtype='string', id=None), 'video link': Value(dtype='string', id=None), 'visual quality': Value(dtype='int64', id=None), 'temporal consistency': Value(dtype='int64', id=None), 'dynamic degree': Value(dtype='int64', id=None), 'text-to-video alignment': Value(dtype='int64', id=None), 'factual consistency': Value(dtype='int64', id=None), 'conversations': [{'from': Value(dtype='string', id=None), 'value': Value(dtype='string', id=None)}]}
because column names don't match
Generating train split: 0%| | 0/32901 [00:00<?, ? examples/s]
Traceback (most recent call last):
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/builder.py", line 1855, in _prepare_split_single
for _, table in generator:
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/packaged_modules/parquet/parquet.py", line 106, in _generate_tables
yield f"{file_idx}_{batch_idx}", self._cast_table(pa_table)
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/packaged_modules/parquet/parquet.py", line 73, in _cast_table
pa_table = table_cast(pa_table, self.info.features.arrow_schema)
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/table.py", line 2293, in table_cast
return cast_table_to_schema(table, schema)
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/table.py", line 2241, in cast_table_to_schema
raise CastError(
datasets.table.CastError: Couldn't cast
id: string
images: list<element: string>
child 0, element: string
text prompt: string
video link: string
visual quality: int64
temporal consistency: int64
dynamic degree: int64
text-to-video alignment: int64
factual consistency: int64
risk level: int64
conversations: list<element: struct<from: string, value: string>>
child 0, element: struct<from: string, value: string>
child 0, from: string
child 1, value: string
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 1637
to
{'id': Value(dtype='string', id=None), 'images': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None), 'text prompt': Value(dtype='string', id=None), 'video link': Value(dtype='string', id=None), 'visual quality': Value(dtype='int64', id=None), 'temporal consistency': Value(dtype='int64', id=None), 'dynamic degree': Value(dtype='int64', id=None), 'text-to-video alignment': Value(dtype='int64', id=None), 'factual consistency': Value(dtype='int64', id=None), 'conversations': [{'from': Value(dtype='string', id=None), 'value': Value(dtype='string', id=None)}]}
because column names don't match
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/share/toky/Projects/SmolVLM2_Test/Smol_VLM_Video_Test.py", line 17, in <module>
ds = load_dataset("/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback", name="annotated")
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/load.py", line 2084, in load_dataset
builder_instance.download_and_prepare(
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/builder.py", line 925, in download_and_prepare
self._download_and_prepare(
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/builder.py", line 1001, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/builder.py", line 1742, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/mnt/share/toky/CondaEnvs/LM/lib/python3.10/site-packages/datasets/builder.py", line 1898, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
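The "column names don't match" message can be narrowed down by diffing the two column lists printed in the CastError. The snippet below is only an illustrative check using the names copied from the traceback; diff_columns is a hypothetical helper, not a datasets API.

```python
def diff_columns(file_columns, declared_columns):
    """Return (columns only in the file, columns only in the declaration)."""
    file_set, declared_set = set(file_columns), set(declared_columns)
    undeclared = [c for c in file_columns if c not in declared_set]
    missing = [c for c in declared_columns if c not in file_set]
    return undeclared, missing


# Columns found in the parquet file (first schema in the error).
parquet_columns = [
    "id", "images", "text prompt", "video link", "visual quality",
    "temporal consistency", "dynamic degree", "text-to-video alignment",
    "factual consistency", "risk level", "conversations",
]

# Columns the loader expected (the schema after "to" in the error).
declared_columns = [
    "id", "images", "text prompt", "video link", "visual quality",
    "temporal consistency", "dynamic degree", "text-to-video alignment",
    "factual consistency", "conversations",
]

undeclared, missing = diff_columns(parquet_columns, declared_columns)
```

Here the only difference is the risk level column, which is present in the parquet file but absent from the expected features.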
So the risk level field has to be added to the README.
After that comes the usual step again:
datasets.exceptions.NonMatchingSplitsSizesError: [{'expected': SplitInfo(name='test', num_bytes=1348268, num_examples=680, shard_lengths=None, dataset_name=None), 'recorded': SplitInfo(name='test', num_bytes=12342, num_examples=6, shard_lengths=None, dataset_name='cholec80_video_feedback')}, {'expected': SplitInfo(name='train', num_bytes=65281005, num_examples=32901, shard_lengths=None, dataset_name=None), 'recorded': SplitInfo(name='train', num_bytes=28919, num_examples=14, shard_lengths=None, dataset_name='cholec80_video_feedback')}]
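This error says the split sizes recorded in the dataset metadata (680 test / 32901 train examples) do not match what was actually generated (6 / 14). One possible way to get past both this check and the stale cache mentioned earlier is sketched below. It assumes a datasets version whose load_dataset accepts download_mode and verification_mode; the helper names are hypothetical, and datasets is imported lazily so the helper that builds the arguments can be inspected on its own.

```python
def build_reload_kwargs(refresh_cache=True, skip_split_checks=True):
    """Build keyword arguments for datasets.load_dataset."""
    kwargs = {}
    if refresh_cache:
        # Rebuild the dataset instead of reusing the cached copy.
        kwargs["download_mode"] = "force_redownload"
    if skip_split_checks:
        # Skip the split size verification that raised the error above.
        kwargs["verification_mode"] = "no_checks"
    return kwargs


def reload_dataset(path, name, **flags):
    """Reload a local dataset with the options chosen above."""
    from datasets import load_dataset  # imported only when actually loading
    return load_dataset(path, name=name, **build_reload_kwargs(**flags))
```

For the dataset in the traceback this would be reload_dataset("/mnt/share/toky/Datasets/Toky_Generate/Cholec80VideoFeedback", "annotated"). Skipping verification only hides the mismatch; updating the split sizes recorded in the README so they match the new parquet files is the cleaner fix.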