The recently proposed Segment Anything Model (SAM) has become a landmark foundation model in computer vision, owing to its ability to segment any object in a given image. Although it serves as a foundational step for various high-level vision tasks, it demands substantial computational resources. This requirement has become a bottleneck for