SAM: Segment Anything Usage Record

Last week, Meta (Facebook) released Segment Anything (SAM), a large model for image segmentation. It is a major breakthrough for segmentation, following ChatGPT, the large model for NLP, and it signals that large models are becoming the trend for future deep learning applications. When I first saw the paper I was very excited, and it took me a long time to calm down. The Segment Anything source code and model parameters have now been released on GitHub. Readers with good hands-on skills have probably already run the code, but likely no more than that.

What makes this model revolutionary is its zero-shot and few-shot capability, so its role should go well beyond simple segmentation. After trying the simple demo on the official website, I found that it has great potential in the field of remote sensing. After some more exploration I found napari, a scientific image visualization library. Combined with SAM it can do some very impressive things, and it is easy to imagine this model greatly changing the traditional remote sensing production workflow.

To be among the first to try it, I searched GitHub for GUIs and scripts built on top of SAM, and after some experimentation I managed to deploy Segment Anything locally and experience how powerful the algorithm is. The rest of this post explains how to deploy it locally.

The following introduction was generated using ChatGPT: napari is a Python-based interactive image viewer that lets you easily visualize large images, 3D data, and other types of data, while also providing simple interactive functions such as zooming, panning, rotating, selecting, and annotating. napari is extensible, so users can add their own layers and plugins, which makes it more flexible and powerful. It also has good performance and memory management for handling very large datasets. napari is an open source project whose source code is available on GitHub. It uses many other Python libraries, including NumPy, PyQt5, VisPy, and SciPy.

It has to be said that ChatGPT really frees up your hands: it can explain code and packages and how to use them quite accurately, which saves a lot of coding time.

However, although napari is introduced here, there is no need to install it separately, because one of its developers built a UI on top of Segment Anything in just four days. That UI is the focus of this article.

The napari-segment-anything package

JoOkuma's interface connecting napari and SAM saved me a great deal of time that I would otherwise have spent going back and forth with ChatGPT (I usually use ChatGPT to build UIs, because I am not very familiar with PyQt5). The author has finished the code, but the README says very little. I made some suggestions in an issue, but it will take time before they are addressed, so I am writing up the complete process here.

The package's home page lists the required environment and installation commands clearly, but let's go over them here:

pip install napari-segment-anything

It is highly recommended to run the above command in a new conda environment, which will save a lot of trouble.

In addition, the package is built on Qt, so PyQt5 needs to be installed as well, which is simple:

pip install pyqt5

The above commands are run inside an environment created with conda:

conda create -n Sam python=3.10

Yes, all the code in this post is based on Python 3.10. If you run into any differences, you can email me.

The installation is quite slow, because the core algorithm, Segment Anything, depends on many packages, and downloading them into a fresh environment takes time. If the download is slow, you can switch to the Tsinghua mirror, which is very fast.

If you use a proxy, I shouldn't need to say more.
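The mirror switch mentioned above can be sketched in Python as follows. This is only a sketch: the helper name `pip_install_command` is hypothetical, and the index URL is the commonly documented address of the Tsinghua (TUNA) PyPI mirror, so check it against the mirror's own help page. The function only builds the command; pip runs only if you pass the result to `subprocess.run`.

```python
import sys

# Assumption: commonly documented address of the Tsinghua (TUNA) PyPI mirror.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(packages, index_url=TSINGHUA_INDEX):
    """Build a `pip install` command that downloads from the given index.

    Hypothetical helper for illustration; it only builds the argument list.
    """
    # Using `python -m pip` makes sure pip installs into the interpreter
    # of the currently active (conda) environment.
    return [sys.executable, "-m", "pip", "install", "-i", index_url, *packages]


if __name__ == "__main__":
    cmd = pip_install_command(["napari-segment-anything", "pyqt5"])
    print(" ".join(cmd))
    # To actually install, uncomment the next two lines:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Building the command from `sys.executable` rather than calling a bare `pip` avoids accidentally installing into a different Python than the one in the conda environment.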

Run the code locally

The installation was basically completed in the previous section, except for the Segment Anything parameter file. It is downloaded automatically the first time you run the code in this section, which is behavior the plugin author built into the code.
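Before launching the UI, a quick check that the installation steps worked can be sketched like this. The helper names are hypothetical, and the list of module names is my assumption about what the pip commands above install (`pyqt5` is imported as `PyQt5`, `napari-segment-anything` as `napari_segment_anything`).

```python
import importlib.util
import sys

# Import names (not pip names) of the packages this post relies on.
# Assumption: these are the module names installed by the pip commands above.
REQUIRED_MODULES = ["napari", "PyQt5", "napari_segment_anything"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches(major, minor):
    """Return True if the running interpreter is Python major.minor."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    if not python_matches(3, 10):
        print("Note: this post was written against Python 3.10, found",
              sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", missing if missing else "none")
```

`find_spec` only looks the modules up without importing them, so the check is fast and does not open any Qt window.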

Once everything above is installed, run the following code to launch the SAM UI:

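# Import the dock widget provided by napari-segment-anything
from napari_segment_anything import SAMWidget
import napari

# Create the napari viewer window
viewer = napari.Viewer()
# Build the SAM widget and dock it on the right side of the viewer
sam_widget = SAMWidget(viewer)
viewer.window.add_dock_widget(sam_widget, name="name", area="right")
# Start the Qt event loop (blocks until the window is closed)
napari.run()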

The first time you run the code, the .pth parameter file for Segment Anything is downloaded automatically, as mentioned earlier:

Downloading https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth to C:\Users\User\.cache\napari-segment-anything\sam_vit_h_4b8939.pth … Download progress: 5.1% (124.1/2445.7 MB)
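A progress line like the one above can be produced with a download hook along these lines. This is a minimal sketch, not the plugin's actual code: the function names are hypothetical, and it assumes "MB" in the log means 1024 × 1024 bytes. It relies on the standard library's `urllib.request.urlretrieve`, whose `reporthook` receives the block count, the block size, and the total size.

```python
import urllib.request

MB = 1024 * 1024  # assumption: the log counts megabytes as 1024 * 1024 bytes


def format_progress(done_bytes, total_bytes):
    """Format a progress line in the style of the log shown above."""
    percent = 100.0 * done_bytes / total_bytes
    return "Download progress: {:.1f}% ({:.1f}/{:.1f} MB)".format(
        percent, done_bytes / MB, total_bytes / MB
    )


def download_with_report(url, destination):
    """Download url to destination, printing progress (hypothetical helper)."""

    def hook(block_count, block_size, total_size):
        if total_size <= 0:
            return  # the server did not report a size
        # The last block may overshoot, so cap at the total size.
        done = min(block_count * block_size, total_size)
        print("\r" + format_progress(done, total_size), end="")

    urllib.request.urlretrieve(url, destination, reporthook=hook)
    print()
```

With roughly 2.4 GB to fetch, a hook like this is mostly useful for telling a slow download apart from a stalled one.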

When the download is finished, the interface runs smoothly. Of course, if you have downloaded the parameter file in advance, you can edit the napari_segment_anything source code to change the path. Locate the package folder in the conda virtual environment and, in utils.py, modify this part of the get_weights_path function:

    weight_path = cache_dir / weight_url.split("/")[-1]

    # Download the weights if they don't exist
    if not weight_path.exists():

Change this code so that it points to your own file, and remember to adjust the `if not` check as well. I had no need to make this change myself, so I left the source code untouched. Of course, if you get stuck on the code, you can contact me by email.
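One way to express the modification described above is sketched below. It is an assumption about how you might restructure the check, not the plugin's actual code: `resolve_weights_path` and its `local_file` parameter are hypothetical names, and only `weight_url`, `cache_dir`, and `weight_path` come from the snippet above.

```python
from pathlib import Path


def resolve_weights_path(weight_url, cache_dir, local_file=None):
    """Return (path, needs_download) for the SAM weights.

    Hypothetical helper: prefers a checkpoint you downloaded yourself and
    falls back to the cache location used in the snippet above.
    """
    # Use the pre-downloaded file when one is given and really exists.
    if local_file is not None and Path(local_file).is_file():
        return Path(local_file), False

    # Otherwise keep the original logic: file name taken from the URL.
    weight_path = Path(cache_dir) / weight_url.split("/")[-1]
    return weight_path, not weight_path.exists()


if __name__ == "__main__":
    url = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth"
    # Cache folder as it appears in the download log earlier in this post.
    cache = Path.home() / ".cache" / "napari-segment-anything"
    # "D:/models/sam_vit_h_4b8939.pth" is a made-up example location.
    print(resolve_weights_path(url, cache, "D:/models/sam_vit_h_4b8939.pth"))
```

Returning a `needs_download` flag keeps the download decision in one place, so the `if not weight_path.exists()` check does not have to be edited separately.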

Usage

At this point everything should work, and the napari_segment_anything interface looks the same as on the home page. I still need to explore the detailed usage further, but the general logic is clear. For example, importing an image:

For the operations, I followed the video on the GitHub home page and clicked around a few times without any problems:

If you find a more detailed usage guide, feel free to get in touch with me after reading it.

Conclusion

The day of a disruptive large model for computer vision has finally arrived. Compared with traditional object-based segmentation, this way of working looks more promising to me. Let's wait and see.