How to retrain TensorFlow Inception model to add new classes on Ubuntu – Source Dexter

41 Comments

  1. RROY
    June 7, 2017 @ 7:23 pm

    Nice write up. Can you please add the steps on how to do the validation of the trained model?

    Thanks,
    Rupam Roy

  2. RROY
    June 7, 2017 @ 7:57 pm

    I am getting the error below when trying to do some validation tests:

    KeyError: "The name 'softmax:0' refers to a Tensor which does not exist. The operation, 'softmax', does not exist in the graph."
    Any idea what is causing this?

    • akshay
      June 10, 2017 @ 9:47 am

      Hi,

      So that error is occurring because the new model that has been trained contains a softmax layer at the end that helps in classification. What script are you using to validate the trained model?
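
One way to debug the KeyError above is to check which operation names the loaded graph actually contains before asking for an output tensor. The sketch below is only an illustration: `pick_output_tensor` is a hypothetical helper, the list of names is made up, and `final_result` is assumed to be the output layer name of the retrained graph. In a real TF 1.x session the names would come from something like `[op.name for op in graph.get_operations()]`.

```python
def pick_output_tensor(op_names, candidates=("final_result", "softmax")):
    """Return '<op>:0' for the first candidate operation present in the graph."""
    available = set(op_names)
    for name in candidates:
        if name in available:
            return name + ":0"
    raise KeyError("none of %r found among %d operations"
                   % (candidates, len(available)))


# Hypothetical operation names standing in for a retrained graph.
retrained_ops = ["DecodeJpeg/contents", "pool_3/_reshape", "final_result"]
print(pick_output_tensor(retrained_ops))
```

If neither candidate is present, the helper raises a KeyError, which is a hint to inspect the graph's operation names by hand.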

      • Wayne Cheng
        June 22, 2017 @ 2:00 pm

        I have the same issue. After using retrain.py to generate the new pb and pbtxt files, I used classifier.py to validate an image file and got the error: “The name ‘softmax:0’ refers to a Tensor which does not exist. The operation, ‘softmax’, does not exist in the graph.” I do get the “Converted 2 variables to const ops.” prompt.

        • akshay
          June 22, 2017 @ 2:08 pm

          Can you show me the full command that you are using to run the classifier.py script?

          • Wayne Cheng
            June 22, 2017 @ 2:23 pm

            The full command is: “python3 classifier.py --image_file ./image/test.jpg”

        • admin
          June 28, 2017 @ 10:46 am

          Somehow I am not getting this error. I have added another file to the git repo, “retrain_new.py”. This new script has been updated to run with the newer version of TensorFlow.

          Another thing is that classifier.py should not be used to test the new model that you have retrained. It is only meant to set up a simple ImageNet classifier and run it. What you would have to do is create a TensorFlow session by loading your new model and get the predictions in that session, sorting them like: top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

          I will upload a full script by this weekend.
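
The `top_k` expression above orders the class indices from most to least probable. A minimal sketch of the same idea without NumPy, assuming `predictions[0]` is a list of per-class probabilities and `labels` holds one label per class in the same order (both are made-up values here, and `rank_classes` is a hypothetical helper):

```python
def rank_classes(probabilities, labels):
    """Return (label, probability) pairs sorted from most to least likely."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i],
                   reverse=True)
    return [(labels[i], probabilities[i]) for i in order]


# Made-up values; a real run would take these from the session output.
predictions = [[0.10, 0.75, 0.15]]
labels = ["daisy", "roses", "tulips"]

for label, score in rank_classes(predictions[0], labels):
    print("%s (score = %.5f)" % (label, score))
```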

          • Wayne Cheng
            July 7, 2017 @ 12:24 pm

            Awesome! The latest version works. Thanks for your quick reply and update.

          • admin
            July 7, 2017 @ 12:35 pm

            You’re welcome

  3. Wayne Cheng
    June 22, 2017 @ 2:20 pm

    “python3 retrain.py --model_dir ./inception --image_dir ../image --input_binary=true”. After getting the “Converted 2 variables to const ops.” message, I copied /tmp/output_graph.pb and /tmp/output_labels.txt to ./inception as classify_image_graph_def.pb and imagenet_2012_challenge_label_map_proto.pbtxt. Then I ran classifier.py and got the error.

  4. Sang
    June 27, 2017 @ 4:03 pm

    Image directory '~\fabrics' not found.
    Traceback (most recent call last):
      File "retrain.py", line 970, in <module>
        tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
      File "C:\Users\itagi\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "retrain.py", line 716, in main
        class_count = len(image_lists.keys())
    AttributeError: 'NoneType' object has no attribute 'keys'

    I am getting the error above when running on Windows 10.

    • admin
      June 27, 2017 @ 4:35 pm

      The “~/fabrics” form is used on Linux. Could you try giving the full path of the folder and running it again?
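
A small sketch of how a path such as "~/fabrics" can be turned into a full path in Python before it is handed to a training script. `resolve_image_dir` is a hypothetical helper, not something retrain.py provides; it relies only on the standard `os.path` functions.

```python
import os


def resolve_image_dir(path):
    """Expand '~' and return an absolute path, or raise if it is not a directory."""
    full_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(full_path):
        raise FileNotFoundError("Image directory not found: " + full_path)
    return full_path


# '~/fabrics' is the folder from the comment above; it may not exist here.
try:
    print(resolve_image_dir("~/fabrics"))
except FileNotFoundError as err:
    print(err)
```

Failing early with the expanded path in the message makes it easier to see which folder was actually looked up.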

      • Sang
        June 28, 2017 @ 11:09 am

        It worked. Thank you very much.
        How do we provide a single image and find out which class it belongs to?

        • admin
          June 28, 2017 @ 11:23 am

          You’re welcome. I haven’t uploaded the script to test the classifier yet. I’ll upload it by this weekend and you can use that.

          Until then, you can try creating a session and running the prediction on the model yourself.

          There is another script, “classifier.py”, in the GitHub repo, but that will only work with the ImageNet model and not the retrained model.

          • Shashi
            June 28, 2017 @ 11:43 am

            Which model is best for image classification: AlexNet or Inception V3?

          • admin
            June 28, 2017 @ 12:20 pm

            Definitely Inception V3. It’s newer and has better accuracy for image classification. There is also ResNet V2 now, which performs better than Inception V3, but you would have to use the new TF-Slim to work with it.

          • Sang
            June 29, 2017 @ 10:20 am

            How do I classify a single image into two categories? For example, I want to categorize a flower as both rose and red, or as both rose and white rose.

          • admin
            June 29, 2017 @ 8:24 pm

            A classifier works by returning the class that has the highest probability. What you want to achieve is a bit tricky. I can think of three ways to do it.
            1. You can create an exhaustive set of classes if there aren’t many. This way, you create one class each for “red rose”, “white rose”, “white tulips”, “pink tulips”, etc. and train a classifier on them. I feel this would be the easiest, but it isn’t scalable if there are a lot of classes and combinations.

            2. Another simple way would be to have separate classes for each flower and its variations. That is one classifier with classes like “rose”, “tulip”, “red rose”, “white rose”, “pink tulip”, etc. Then, when you perform the classification, instead of picking only the result with the highest probability, you can get all the probability values and write custom logic that picks the highest probability among the flower classes (rose, tulip, etc.) and then, among that flower’s corresponding colors (red rose, white rose, etc.), picks the highest probability.

            3. Another way would be to have multiple classifiers in place. This is a cleaner solution: one classifier for roses, tulips, etc. and another one to identify the colors red, white, pink, etc.
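
The custom logic from option 2 can be sketched as follows. The class names, the scores, and the `variants_by_flower` mapping are invented for illustration; a real classifier would supply one probability per class.

```python
def pick_flower_and_variant(scores, variants_by_flower):
    """Pick the most likely flower, then the most likely of its colour variants."""
    flower = max(variants_by_flower, key=lambda name: scores[name])
    variant = max(variants_by_flower[flower], key=lambda name: scores[name])
    return flower, variant


# Hypothetical per-class probabilities from a single classifier.
scores = {
    "rose": 0.40,
    "tulip": 0.05,
    "red rose": 0.30,
    "white rose": 0.10,
    "pink tulip": 0.15,
}
# Hypothetical mapping from each flower class to its colour classes.
variants_by_flower = {
    "rose": ["red rose", "white rose"],
    "tulip": ["pink tulip"],
}

print(pick_flower_and_variant(scores, variants_by_flower))
```

The second `max` only looks at the variants of the flower chosen first, so a high score for "pink tulip" cannot override a confident "rose".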

  5. Shashi
    June 30, 2017 @ 12:12 pm

    Do you have a video on YouTube that briefly explains the retrain.py code? I am new to deep learning. Otherwise, can you give a short summary of how the code works? Can you also explain how retrain.py takes the datasets for training and testing, and how it reports the training accuracy and related parameters?
    Thank you

    • admin
      June 30, 2017 @ 3:28 pm

      Hi,
      So there isn’t a video yet. But I am planning to create one and hopefully upload it soon. I will mail you once the video is up so that you can learn via that. Giving a summary of the code requires a lot of time as many things are involved in that. Due to that, I would have to write another blog post and as of now, that would take a long time. But I will create and upload a video for sure.

  6. Sang
    July 3, 2017 @ 9:58 am

    Hello there! You said you would upload a script for the classifier. If you have uploaded it, can you give me the link? Thank you!

    • admin
      July 3, 2017 @ 10:45 am

      Hi, I have updated the blog post. See the section “Update 1: Added code to test the retrained model” to learn how you can test it. I have added the code to the GitHub repo as well.

      • Sang
        July 3, 2017 @ 4:52 pm

        C:\Users\i\tfClassifier>python retrain_model_classifier.py D:\CSDAML\Project_fabric\Dataset\img01.jpg
        Traceback (most recent call last):
          File "retrain_model_classifier.py", line 12, in <module>
            in tf.gfile.GFile("./labels.txt")]
          File "retrain_model_classifier.py", line 11, in <listcomp>
            label_lines = [line.rstrip() for line
          File "C:\Users\i\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 170, in __next__
            return self.next()
          File "C:\Users\\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 164, in next
            retval = self.readline()
          File "C:\Users\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 133, in readline
            self._preread_check()
          File "C:\Users\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 75, in _preread_check
            compat.as_bytes(self.__name), 1024 * 512, status)
          File "C:\Users\Anaconda3\lib\contextlib.py", line 66, in __exit__
            next(self.gen)
          File "C:\Users\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
            pywrap_tensorflow.TF_GetCode(status))
        tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ./labels.txt : The system cannot find the file specified.

        I am getting this error when I run the retrained-model classifier on Windows.

        • admin
          July 3, 2017 @ 5:05 pm

          That is because the file path given doesn’t follow the Windows convention. In the script “retrain_model_classifier.py”, you have to change “./labels.txt” to the full path of where that file is located.
          Do the same for “./output.pb” as well: replace it with the full path of the file on Windows. It will then work.
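
A minimal sketch of reading the labels file through a full path, with a clear message when the file is missing. `load_labels` is a hypothetical helper and not part of retrain_model_classifier.py; the file name "./labels.txt" is the one from the traceback above.

```python
import os


def load_labels(labels_path):
    """Read one label per line from labels_path, resolved to an absolute path."""
    full_path = os.path.abspath(os.path.expanduser(labels_path))
    if not os.path.isfile(full_path):
        raise FileNotFoundError("Labels file not found: " + full_path)
    with open(full_path) as handle:
        return [line.rstrip() for line in handle]


# The file may not exist in the current directory; report where it was expected.
try:
    print(load_labels("./labels.txt"))
except FileNotFoundError as err:
    print(err)
```

Printing the resolved path shows which directory a relative name such as "./labels.txt" was actually looked up in.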

          • Sang
            July 3, 2017 @ 7:17 pm

            Where can I find these two files? Are they in your tfClassifier directory or in the Anaconda libraries? I could not find the labels.txt file in your directory.

          • admin
            July 3, 2017 @ 7:20 pm

            These two files are the output of your retraining script. When you run retrain.py, labels.txt and output.pb should be generated. They will be stored in the same directory as retrain.py. Once you have those two, you can run the script for testing your model.

  7. Sang
    July 3, 2017 @ 8:19 pm

    When I ran retrain.py, the total test accuracy was printed in the command prompt window, but these two files were not generated in my directory. I do have a file called output, but it is not of type output.pb. Is this because I am working on Windows?

    • admin
      July 4, 2017 @ 8:05 pm

      If you got that in the command prompt, the files would have been stored. You can check the code in retrain.py to see where the files are being stored. You can also change the path so that you know exactly where they will be stored.

  8. jordan
    July 5, 2017 @ 12:13 pm

    Hi,
    Thanks for this tutorial! I managed to get it working, but in my labels.txt I am not getting the node names.

    Instead of:
    n00004475 organism, being
    n00005787 benthos

    I am getting just:
    tulips
    roses

    And when I test the classifier, it returns tulips when I give it roses and vice versa.
    Could this be due to the missing node names?

    • admin
      July 5, 2017 @ 12:28 pm

      Hey Jordan. You are probably getting roses and tulips because the folders in your image directory are named roses and tulips instead of organism, being, etc.

      The classification might be wrong because the model was not trained for enough steps. You can increase --how_many_training_steps from 500 to, say, around 4000 when running the command. This way you might get better accuracy.
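
To illustrate how label names can follow from folder names, here is a small sketch that lists the subfolders of an image directory and normalises each name into a label. It mirrors the idea only; `labels_from_folders` is a hypothetical helper, and the exact normalisation and ordering used by retrain.py are assumptions.

```python
import os
import re
import tempfile


def labels_from_folders(image_dir):
    """Return one lowercase label per subfolder of image_dir, sorted by folder name."""
    labels = []
    for name in sorted(os.listdir(image_dir)):
        if os.path.isdir(os.path.join(image_dir, name)):
            # Replace anything that is not a lowercase letter or digit with a space.
            labels.append(re.sub(r"[^a-z0-9]+", " ", name.lower()))
    return labels


# Build a throwaway image directory with two class folders.
with tempfile.TemporaryDirectory() as image_dir:
    for folder in ("Roses", "tulips"):
        os.mkdir(os.path.join(image_dir, folder))
    print(labels_from_folders(image_dir))
```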

  9. Mich
    July 11, 2017 @ 4:59 pm

    I get this error when I try to retrain my own classifier on Ubuntu 16.04:

    Traceback (most recent call last):
      File "retrain.py", line 967, in <module>
        tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "retrain.py", line 750, in main
        train_writer = tf.train.SummaryWriter(FLAGS.summaries_dir + '/train', sess.graph)
    AttributeError: 'module' object has no attribute 'SummaryWriter'

    Any idea about this problem? Thank you

    • admin
      July 11, 2017 @ 5:20 pm

      Hi, I think that your TensorFlow version is older.
      You can run “pip install tensorflow --upgrade” for Python 2 or “pip3 install tensorflow --upgrade” for Python 3 to upgrade to the latest TensorFlow version and try again.

      • Mich
        July 11, 2017 @ 6:06 pm

        Thanks for the answer. This didn’t work but I solved the problem by changing:

        train_writer = tf.train.SummaryWriter(FLAGS.summaries_dir + '/train', sess.graph)
        validation_writer = tf.train.SummaryWriter(FLAGS.summaries_dir + '/validation')

        to

        train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train', sess.graph)
        validation_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/validation')
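
The rename described above (tf.train.SummaryWriter to tf.summary.FileWriter) can also be handled with a small lookup, so that one script runs on either side of the change. This is a sketch under assumptions: `get_summary_writer_class` is a hypothetical helper, and the two stand-in objects below imitate TensorFlow modules so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def get_summary_writer_class(tf_module):
    """Return tf.summary.FileWriter if present, else the older tf.train.SummaryWriter."""
    summary = getattr(tf_module, "summary", None)
    writer = getattr(summary, "FileWriter", None)
    if writer is None:
        train = getattr(tf_module, "train", None)
        writer = getattr(train, "SummaryWriter", None)
    if writer is None:
        raise AttributeError("no summary writer class found on this module")
    return writer


# Stand-ins for two TensorFlow releases (not the real library).
class FileWriter(object):
    pass


class SummaryWriter(object):
    pass


newer_tf = SimpleNamespace(summary=SimpleNamespace(FileWriter=FileWriter),
                           train=SimpleNamespace())
older_tf = SimpleNamespace(summary=SimpleNamespace(),
                           train=SimpleNamespace(SummaryWriter=SummaryWriter))

print(get_summary_writer_class(newer_tf).__name__)
print(get_summary_writer_class(older_tf).__name__)
```

With the real library, the returned class would be called the same way as in the lines above, for example `get_summary_writer_class(tf)(FLAGS.summaries_dir + '/train', sess.graph)`.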

        • admin
          July 11, 2017 @ 6:11 pm

          Awesome ..

  10. Shashank Bongale
    July 16, 2017 @ 1:40 pm

    ./labels.txt has not been created.

    • admin
      July 16, 2017 @ 1:45 pm

      It should be created. This code is working perfectly on both Windows and Linux. Please make sure that you have given the path correctly while starting the script.

      • Shashank Bongale
        July 16, 2017 @ 1:47 pm

        WHICH PATH

        • admin
          July 16, 2017 @ 1:58 pm

          By default, your labels.txt is stored in /tmp/output_labels.txt. If you aren’t on Ubuntu, you can give another option when running the script: --output_labels

          • Shashank Bongale
            July 18, 2017 @ 1:59 pm

            Okay, thank you very much. It’s working now.

          • admin
            July 18, 2017 @ 2:06 pm

            You’re welcome

  11. Fin
    August 10, 2017 @ 10:15 pm

    Hi, I get this error when I try to retrain:
    Traceback (most recent call last):
      File "retrain_new.py", line 1019, in <module>
        tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
      File "/PythonCode/tfLabs/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "retrain_new.py", line 872, in main
        f.write(output_graph_def.SerializeToString())
      File "/PythonCode/tfLabs/lib/python3.5/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
        self._prewrite_check()
      File "/PythonCode/tfLabs/lib/python3.5/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
        compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
      File "/usr/lib64/python3.5/contextlib.py", line 66, in __exit__
        next(self.gen)
      File "/PythonCode/tfLabs/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors_impl.FailedPreconditionError: ./output_dir/
