SAM is a powerful vision foundation model that can segment any object in an image based on user interaction prompts. SAM gained significant traction in the computer vision community on launch for its accuracy. However, SAM's extensive use of the computationally expensive Vision Transformer (ViT) architecture limits its practical applications, particularly in real-time scenarios.
FastSAM is an open source image segmentation model trained on 2% of the SA-1B dataset on which the Segment Anything Model (SAM) was trained. FastSAM reportedly runs 50 times faster than SAM.
FastSAM overcomes the computational barrier associated with using SAM by employing a decoupled approach. FastSAM divides the segmentation task into two sequential stages: all-instance segmentation and prompt-guided selection.
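To make the decoupling concrete, here is a toy sketch in plain Python (not the FastSAM API): stage one proposes a mask for every instance up front, and stage two merely selects among those proposals based on a prompt. The helper names and the label-based "segmentation" are illustrative stand-ins for FastSAM's CNN detector.

```python
import numpy as np

def all_instance_segmentation(image):
    # Stage 1: propose a mask for every instance up front.
    # Toy stand-in for FastSAM's CNN detector: each distinct
    # non-zero label in the array is treated as one instance.
    return [image == label for label in np.unique(image) if label != 0]

def prompt_guided_selection(masks, point):
    # Stage 2: select only the proposed mask(s) containing the prompt point.
    row, col = point
    return [m for m in masks if m[row, col]]

# toy "image" with two labeled instances
image = np.zeros((4, 4), dtype=int)
image[0, 0] = 1      # instance 1: a single pixel
image[2:, 2:] = 2    # instance 2: a 2x2 block

masks = all_instance_segmentation(image)                 # both instances segmented once
selected = prompt_guided_selection(masks, point=(3, 3))  # prompt falls inside instance 2
print(len(masks), len(selected))  # 2 1
```

Because stage one runs only once per image, responding to a new prompt costs almost nothing, which is the source of FastSAM's speed advantage.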
In this blog post, we'll explore FastSAM, highlight its advantages over SAM, and provide practical code examples for various image segmentation tasks, including counting pills and segmenting chip components.
How to Use FastSAM
In this guide, we'll use SAM and FastSAM together to visually compare their performance. If you're only interested in FastSAM, you can skip the parts below where we install and use SAM. Let's dive into FastSAM!
Step #1: Install FastSAM and SAM
First, let's install FastSAM and SAM along with their required dependencies:
!git clone https://github.com/CASIA-IVA-Lab/FastSAM.git
!pip -q install -r FastSAM/requirements.txt
!pip -q install git+https://github.com/openai/CLIP.git roboflow supervision
!pip -q install git+https://github.com/facebookresearch/segment-anything.git
!wget -P FastSAM/weights https://huggingface.co/spaces/An-619/FastSAM/resolve/main/weights/FastSAM.pt
!wget -P FastSAM/weights https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
Step #2: Import Libraries
Next, we'll import the required libraries and load the FastSAM model:
import os
import cv2
import torch

from fastsam import FastSAM, FastSAMPrompt
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor
import supervision as sv
import roboflow
from roboflow import Roboflow

# the snippets below assume this DEVICE variable
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = FastSAM('./FastSAM/weights/FastSAM.pt')
Step #3: Visualize Masks Using FastSAM
Let's visualize the segmentation masks generated by FastSAM on a few example images.
The retina_masks=True parameter determines whether the model uses retina masks for generating segmentation masks. imgsz=1024 sets the input image size to 1024×1024 pixels for processing by the model. conf=0.4 sets the minimum confidence threshold for object detection, and iou=0.9 sets the minimum intersection over union (IoU) threshold for non-maximum suppression, which filters out duplicate detections.
folder = './images/'
Images = ['tool1.jpg', 'bone.jpg', 'stamp.jpg', 'plant1.jpg', 'chip.jpeg', 'pill1.png']

for idx, img_name in enumerate(Images):
    path = os.path.join(folder, img_name)
    everything_results = model(path, device=DEVICE, retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
    prompt_process = FastSAMPrompt(path, everything_results, device=DEVICE)
    ann = prompt_process.everything_prompt()
    output_filename = f'output_{idx}.jpg'
    output_path = os.path.join('./output/', output_filename)
    prompt_process.plot(annotations=ann, output=output_path)
The code snippet above loops through a list of example images and generates segmentation masks using FastSAM. The resulting masks are then visualized and saved as output images.
FastSAM on Roboflow Benchmark Dataset
To further demonstrate FastSAM's capabilities, let's apply it to the Roboflow benchmark dataset. We will use text prompts to guide the segmentation process.
Download the Roboflow Benchmark Dataset
Next, we'll download the Roboflow benchmark dataset, specifically the images from the training set:
roboflow.login()

rf = Roboflow()
project = rf.workspace("roboticfish").project("underwater_object_detection")
dataset = project.version(8).download("yolov8")
train_folder = os.path.join(dataset.location, 'train', 'images')
Apply FastSAM with Text Prompts
Now, let's apply FastSAM to an image from the Roboflow benchmark dataset using text prompts. In this example, we'll pick one image and provide the prompt "Penguin" to guide the segmentation process.
# pick one image from the training set (the original post does not specify which file)
image_path = os.path.join(train_folder, os.listdir(train_folder)[0])

everything_results = model(image_path, device=DEVICE, retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
prompt_process = FastSAMPrompt(image_path, everything_results, device=DEVICE)
ann = prompt_process.text_prompt(text='Penguin')
prompt_process.plot(annotations=ann, output='./output/')
The code snippet above loads an image from the Roboflow dataset and applies FastSAM with the text prompt "Penguin". The resulting segmentation mask is visualized and saved as an output image.
SAM vs. FastSAM: Visualization of Distinct Masks
To compare the segmentation masks generated by SAM and FastSAM, let's extract the masks from both models and visualize them.
First, we apply FastSAM to an example image and extract the segmentation masks using the everything_prompt() function. The resulting masks are visualized and saved as an output image.
IMAGE_PATH = './images/real3.jpeg'

everything_results = model(IMAGE_PATH, device=DEVICE, retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
prompt_process = FastSAMPrompt(IMAGE_PATH, everything_results, device=DEVICE)
ann = prompt_process.everything_prompt()
prompt_process.plot(annotations=ann, output='./output/')
In the code snippet below, we extract the segmentation masks using SAM for the same example image. The resulting masks are stored in the sam_masks variable.
image = cv2.cvtColor(cv2.imread(IMAGE_PATH), cv2.COLOR_BGR2RGB)

sam_checkpoint = "FastSAM/weights/sam_vit_h_4b8939.pth"
model_type = "vit_h"
sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
sam.to(device=DEVICE)

mask_generator = SamAutomaticMaskGenerator(sam)
sam_masks = mask_generator.generate(image)

fastsam_mask_np = ann.cpu().numpy()
Compare FastSAM and SAM Segmentation Masks
The code below compares the segmentation masks generated by FastSAM and SAM. The FastSAM output image and the image annotated with SAM masks are displayed side by side for visual comparison.
fastsam_output = cv2.imread("./output/real3.jpeg")
image_bgr = cv2.imread(IMAGE_PATH)

mask_annotator = sv.MaskAnnotator()
detections = sv.Detections.from_sam(sam_result=sam_masks)
annotated_image = mask_annotator.annotate(scene=image_bgr.copy(), detections=detections)

sv.plot_images_grid(
    images=[fastsam_output, annotated_image],
    grid_size=(1, 2),
    titles=['FastSAM segmented image', 'SAM segmented image']
)
Visualize Distinct Masks
We visualize the distinct masks by comparing the masks generated by FastSAM and SAM. Masks with an intersection over union (IoU) below 0.05 are considered distinct and displayed on top of the original image.
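The filtering snippet itself isn't shown above, so here is a minimal sketch of the logic in plain NumPy. The helper names `mask_iou` and `distinct_masks` are hypothetical, and the example runs on tiny synthetic masks rather than the real model outputs; with the real outputs you would pass `[m['segmentation'] for m in sam_masks]` (SAM's automatic mask generator returns dicts with a boolean 'segmentation' entry) and the boolean masks from `fastsam_mask_np`.

```python
import numpy as np

def mask_iou(mask_a, mask_b):
    # intersection over union of two boolean masks
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / union if union > 0 else 0.0

def distinct_masks(candidate_masks, reference_masks, iou_threshold=0.05):
    # keep candidates whose best IoU against every reference mask is below the threshold
    return [
        m for m in candidate_masks
        if max(mask_iou(m, ref) for ref in reference_masks) < iou_threshold
    ]

# tiny synthetic example: two non-overlapping 4x4 masks
a = np.zeros((4, 4), dtype=bool); a[:2, :2] = True   # top-left block
b = np.zeros((4, 4), dtype=bool); b[2:, 2:] = True   # bottom-right block

print(mask_iou(a, a))                      # identical masks -> 1.0
print(mask_iou(a, b))                      # no overlap -> 0.0
print(len(distinct_masks([a, b], [a])))    # only b survives the filter -> 1
```

The surviving masks can then be overlaid on the original image with the same sv.MaskAnnotator used earlier.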
The Fast Segment Anything Model (FastSAM) is a powerful addition to the Segment Anything Model (SAM) for image segmentation tasks. FastSAM overcomes the computational limitations of SAM by employing a decoupled approach and a Convolutional Neural Network (CNN)-based detector. This allows FastSAM to achieve real-time segmentation without significantly compromising quality.
By training the CNN detector on only 2% of the SA-1B dataset, FastSAM achieves performance comparable to SAM while running 50 times faster. This makes FastSAM better suited to scenarios where speed is critical.
The code examples provided above demonstrate how to install and use FastSAM, visualize the segmentation masks it generates, apply FastSAM with text prompts, and compare the segmentation masks of FastSAM and SAM. FastSAM proves to be an efficient and accurate tool for various image segmentation tasks, including counting pills and segmenting chip components.