This repository has been archived by the owner on Apr 14, 2022. It is now read-only.

High memory usage #832

Closed
gramster opened this issue Mar 27, 2019 · 100 comments


@gramster
Member

@suiahaw commented on Tue Mar 26 2019

Issue Type: Bug

Some issues encountered with the Python plugin while it builds its index of the Python libraries

I strongly hope that the Python plugin will not read all of this information into memory in real time when creating the Python library index, but will instead save the index to a file, in order to speed things up and reduce memory overhead.

The Python libraries are simply too big. Sometimes I have to wait several minutes before I can write even a small amount of code.
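For illustration only, the kind of on-disk caching being requested might look like the minimal sketch below (the cache location and function names are hypothetical, not anything the extension actually provides):

import json
import os

INDEX_FILE = os.path.expanduser("~/.python_lib_index.json")  # hypothetical cache location

def build_index():
    # Stand-in for the expensive, memory-hungry scan of the installed libraries.
    return {"pandas": ["DataFrame", "read_csv"], "numpy": ["ndarray", "array"]}

def load_or_build_index():
    if os.path.exists(INDEX_FILE):
        with open(INDEX_FILE) as f:
            return json.load(f)      # cheap: reuse the previously saved index
    index = build_index()            # expensive: only done the first time
    with open(INDEX_FILE, "w") as f:
        json.dump(index, f)
    return index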

Extension version: 2019.3.6139
VS Code version: Code 1.32.3 (a3db5be9b5c6ba46bb7555ec5d60178ecc2eaae4, 2019-03-14T23:43:35.476Z)
OS version: Windows_NT x64 10.0.17763

System Info
Item Value
CPUs Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (4 x 2712)
GPU Status 2d_canvas: enabled
checker_imaging: disabled_off
flash_3d: enabled
flash_stage3d: enabled
flash_stage3d_baseline: enabled
gpu_compositing: enabled
multiple_raster_threads: enabled_on
native_gpu_memory_buffers: disabled_software
rasterization: enabled
surface_synchronization: enabled_on
video_decode: enabled
webgl: enabled
webgl2: enabled
Memory (System) 15.86GB (8.60GB free)
Process Argv
Screen Reader no
VM 46%

@algo99 commented on Wed Mar 27 2019

I can confirm, on Windows 10.

After yesterday's update of the Python Language Server and/or the vscode-python extension, VS Code is no longer usable. See the disk I/O and memory consumption:

[screenshot]

After VS Code starts, the extension reports:

Analyzing in background, 6879 items left ...

and this process never stops.
I have restarted VS Code several times, but it does not help.


@algo99 commented on Wed Mar 27 2019

Just noted in Python output pane:

...
[Info  - 09:17:10] Microsoft Python Language Server version 0.2.31.0
[Info  - 09:17:10] Initializing for c:\3rd\WinPython\python-3.7.2.amd64\python.exe
Traceback (most recent call last):
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 1497, in <module>
    state.collect_second_level_members()
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 944, in collect_second_level_members
    self._collect_members(mi.value, mi.members, substitutes, mi)
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 958, in _collect_members
    raise RuntimeError("failed to import module")
RuntimeError: failed to import module
Traceback (most recent call last):
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 1497, in <module>
    state.collect_second_level_members()
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 944, in collect_second_level_members
    self._collect_members(mi.value, mi.members, substitutes, mi)
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 958, in _collect_members
    raise RuntimeError("failed to import module")
RuntimeError: failed to import module
Traceback (most recent call last):
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 1489, in <module>
    state.initial_import(sys.argv[2])
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 872, in initial_import
    mod = __import__(self.module_name)
  File "c:\3rd\winpython\python-3.7.2.amd64\lib\site-packages\cvxopt\__init__.py", line 50, in <module>
    import cvxopt.base
ImportError: DLL load failed: Das angegebene Modul wurde nicht gefunden.

@dilzeem commented on Wed Mar 27 2019

I am getting the same issue as well, on Windows 10.

It only happens once I start debugging some code.
It renders VS Code unusable.


@suiahaw commented on Wed Mar 27 2019

It only happens when the Python plugin loads the Python libraries, which requires a lot of CPU and memory. The information that is loaded into memory is presumably used for IntelliCode.


@ray306 commented on Wed Mar 27 2019

The same problem started for me today. I think it's a bug in the new Python plugin released yesterday (March 26).


@dilzeem commented on Wed Mar 27 2019

Possibly a duplicate of this issue in the Python language server:

#450


@suiahaw commented on Wed Mar 27 2019

Possibly a duplicate of this issue in the Python language server:

Microsoft/python-language-server#450

I also think a large share of the issues filed against the Python plugin describe symptoms similar to this bug. It clearly makes programming very difficult and needs to be fixed urgently. ;-)

@MikhailArkhipov

MikhailArkhipov commented Mar 27, 2019

Could you post specific memory consumption numbers and the approximate time until analysis completes? Generally you don't have to wait until it completes, since it is a background task that collects information in stages. 4GB consumption may happen; the memory is eventually released.

#450 was about 30GB+ of runaway consumption eating all memory, so this is not a duplicate. The number of tasks is limited, so the process should not be consuming all CPU; the limit is roughly 40%.

It would also be helpful if we could clone your projects to investigate.

Thanks.

@algo99

algo99 commented Mar 27, 2019

It has been quite strange for me.

  1. I normally work with WinPython. As I wrote above, the analysis task did not stop (even overnight). In the end VS Code simply crashed, and the whole time the GUI was not usable at all.

  2. Today I installed Anaconda3 and tried again. VS Code now works with Python as usual, but these messages still appear in the Python output pane:

Traceback (most recent call last):
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 1489, in <module>
    state.initial_import(sys.argv[2])
  File "C:\Users\zkr\.vscode\extensions\ms-python.python-2019.3.6139\languageServer.0.2.31\scrape_module.py", line 872, in initial_import
    mod = __import__(self.module_name)
  File "c:\3rd\anaconda3\lib\site-packages\scipy\__init__.py", line 62, in <module>
    from numpy import show_config as show_numpy_config
  File "c:\3rd\anaconda3\lib\site-packages\numpy\__init__.py", line 140, in <module>
    from . import _distributor_init
  File "c:\3rd\anaconda3\lib\site-packages\numpy\_distributor_init.py", line 34, in <module>
    from . import _mklinit
ImportError: DLL load failed: Das angegebene Modul wurde nicht gefunden.

Looks like the server cannot import some DLL related to numpy now.
In my report above (with WinPython), it was a DLL related to cvxpy.

Could it be an ABI issue?
Yesterday I updated my Visual Studio Build Tools 2017 and decided to reinstall the cvxpy package (which rebuilds the related DLLs):

pip install -U cvxpy

After restarting VS Code (yesterday, with WinPython) there were no more messages about cvxpy, but many "DLL load failed" messages for other packages.

@dilzeem

dilzeem commented Mar 27, 2019

Some further information:

I saw consumption at 22GB+, with CPU usage at around 15-25%.

It happens when I open any Python file. This is 30 seconds after opening a simple Python file that just imports pandas.

[screenshot]

It was analyzing a number of files, as indicated at the bottom of the status bar:
[screenshot]

After a few minutes it has stabilized to this:
[screenshot]

I believe the issue is that having a virtual environment folder inside the workspace causes a huge spike in RAM usage, since the server analyzes over 7000 files.

Version: 1.32.3 (user setup)
Commit: a3db5be9b5c6ba46bb7555ec5d60178ecc2eaae4
Date: 2019-03-14T23:43:35.476Z
Electron: 3.1.6
Chrome: 66.0.3359.181
Node.js: 10.2.0
V8: 6.6.346.32
OS: Windows_NT x64 10.0.17134

@suiahae

suiahae commented Mar 27, 2019

After some analysis, I think the memory consumption is caused by the Python plugin creating the Python library index by reading the information into memory in real time. (The memory consumed by the library index is used for IntelliSense, I think.)

Further, the memory consumption and the approximate time until analysis completes depend on the size of the extra libraries installed for Python. (In other words, the bigger the installed libraries, the more memory is occupied and the more time is required.)
For a default Anaconda installation, memory consumption is about 6GB and analysis takes about 15 minutes to complete.

I can debug existing code before it's done, but I can't use IntelliSense when writing code.

Thanks!

@ly-cultureiq

RAM consumption for me hit 7.7GB before running out of memory.
The number of modules remaining to analyze keeps rising as well.

@jakebailey
Member

@algo99 The DLL importing thing was technically a problem in the old version, but just wasn't printed. The currently released version does print it, but we merged a PR which when released will re-hide it (see #823).

@suiahaw The language server doesn't keep an on-disk version of the analysis yet (and never has, apart from scraped information from compiled libraries that cannot be directly parsed and analyzed). That is #472.

@gramster
Member Author

gramster commented Mar 27, 2019

@dilzeem, import pandas means the LS will analyze pandas and its transitive dependencies (numpy etc.), so there is a lot going on at first. Taking a couple of minutes for that is not unexpected, but after that you should have many completions available (including many which we simply did not have at all in the previous language server). I just want to set expectations around expected behavior versus actual bugs. We want to track down bugs like memory leaks and analysis not completing, and we will make improvements in the future around caching and type stubs to speed up analysis of common packages, but it's also the case that some initial analysis cost is to be expected when importing complex packages.
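For a rough sense of the scale involved, a small script like the one below counts how many modules a single import pandas pulls in at runtime (illustrative only; it assumes pandas is installed and measures imported modules, not what the language server analyzes):

import sys

before = set(sys.modules)
import pandas  # noqa: F401  (assumes pandas is installed)
after = set(sys.modules)

print(f"'import pandas' loaded {len(after - before)} additional modules")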

@suiahae

suiahae commented Mar 27, 2019

@algo99 The DLL importing thing was technically a problem in the old version, but just wasn't printed. The currently released version does print it, but we merged a PR which when released will re-hide it (see #823).

@suiahaw The language server doesn't keep an on-disk version of the analysis yet (and never has, apart from scraped information from compiled libraries that cannot be directly parsed and analyzed). That is #472.

If no on-disk version of the analysis exists, then whenever I write a Python program in VS Code I have to pay the extra memory and time overhead again. This would make me unable to use VS Code as my main development tool; an on-disk version of the analysis is necessary. On the other hand, the extra disk overhead from the analysis could be addressed by compressing the database and periodically reminding users to clean it up.

@MikhailArkhipov

@dilzeem - a 6GB peak is not unusual at the moment, considering it used to be 30GB+ in #450; 20GB does appear excessive, though.

@dilzeem

dilzeem commented Mar 27, 2019

@MikhailArkhipov, @gramster thanks for the quick response.

Okay, it was surprising, as I didn't have any issues like this until the March Python extension update, so it was unexpected behavior.

I will see if I can recreate the 25GB+ scenario tomorrow and give you an update. The project I was using had a lot of modules, mainly scikit-learn, pandas, dash and gensim.

The computer actually crashed when I opened two similar projects in different windows.

@jakebailey changed the title from "Huge CPU and Memory consumption caused by design problems" to "High memory usage" on Mar 27, 2019
@micw523

micw523 commented Mar 27, 2019

I'm not sure if this is an old bug, since I'm also seeing this behavior only after the most recent update.
[screenshot]

@ranka47

ranka47 commented Mar 28, 2019

I started facing this issue in the last week. My laptop has 8GB of RAM and the OS is Ubuntu 16.04. After opening a Python file, RAM usage shoots up until memory is completely exhausted. The only way I am able to use VS Code is by disabling the extension.
I tried disabling python.jediEnabled, and I tried removing the extension from the .vscode folder and reinstalling it, but the issue still persists.

@dilzeem

dilzeem commented Mar 28, 2019

Just an update:

I'm just importing 4 libraries. Initially I got this, with "Analyzing in background" at the bottom frozen at 9709 items left:

[screenshot]

After approximately 30 minutes of analyzing it was still at 20+ GB of RAM usage, and the number of items left had increased to 10479.

[screenshot]

After 40 minutes of analyzing, VS Code crashed.

@suiahae

suiahae commented Mar 28, 2019

Just an update:

Just importing 4 libraries

[screenshot]

Analyzing in background at the bottom is frozen at 9709 items left.
It has been analyzing for approximately 30 minutes and is still at 20+ GB of RAM usage, and the number of items left has increased to 10479.

[screenshot]

I will keep it running, see whether it goes down, and note approximately how long it takes.

After experimenting, I think the memory consumption and the approximate time until analysis completes depend on the size of the extra libraries installed for Python, rather than on how many libraries are imported.

See details: #832 (comment)

@zhen8838

I just import Tensorflow 13.1 ... during analysis it uses 10 GB of memory, and when finished it still uses 8 GB.
OS: Ubuntu 18.04
VS Code version:

1.32.3
a3db5be9b5c6ba46bb7555ec5d60178ecc2eaae4
x64

@dilzeem

dilzeem commented Mar 28, 2019

After experimenting, I think the memory consumption and the approximate time until analysis completes depend on the size of the extra libraries installed for Python, rather than on how many libraries are imported.

See details: #832 (comment)

I am not so sure that is the case.

When I import only pandas, VS Code can handle it; after 5 minutes it uses 6GB of RAM.
When I import 3 or 4 libraries, VS Code maxes out my RAM (20GB+) and crashes.

@CaselIT

CaselIT commented Mar 28, 2019

Same issue here: high memory consumption, and the background analysis does not seem to finish.

It has been completely unusable since the March update.

The previous version worked mostly fine with the same project in terms of memory consumption and resource usage during analysis.

@gramster
Member Author

gramster commented Mar 28, 2019 via email

@nbara

nbara commented Mar 28, 2019

Definitely also occurs on Mac (10.14.4).

Here, after just a few minutes of leaving Code idle in the background:

[Screenshot, 2019-03-28 14:08]

Code is already eating up 10 GB of RAM, and it's still going up.

Let me know if I can provide something more useful than these screenshots.

@CaselIT

CaselIT commented Mar 28, 2019

@gramster I'm on Windows 10.
The screenshot from @zhen8838 in #832 (comment) appears to be Ubuntu.

@ly-cultureiq

I'm running Linux (Fedora) and seeing the issue.

@suiahae

suiahae commented Mar 28, 2019

@gramster
The following is my personal inference (translated from Chinese by Google Translate):

The Microsoft Python analysis engine works by loading all the information about the installed Python modules into memory at once. After the full load, code completion (IntelliCode) is fast, but the memory footprint is very large.

Jedi, by contrast, loads a piece of information only when it is needed. This reduces long-term memory usage, but code hints (IntelliCode) are slightly delayed and CPU usage is high while loading.

I hope this will help you.
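A toy sketch of the difference being described, purely for illustration (the classes below are hypothetical and are not the actual language server or Jedi APIs):

import importlib
import pkgutil

class EagerIndex:
    """Index every submodule of a package up front: large memory, fast lookups later."""

    def __init__(self, package_name):
        pkg = importlib.import_module(package_name)
        self.members = {}
        for info in pkgutil.walk_packages(pkg.__path__, prefix=package_name + "."):
            try:
                mod = importlib.import_module(info.name)
                self.members[info.name] = dir(mod)
            except Exception:
                pass  # skip submodules that fail to import

    def lookup(self, module_name):
        return self.members.get(module_name, [])

class LazyIndex:
    """Resolve a module only when it is first requested: small memory, slower first lookup."""

    def __init__(self):
        self._cache = {}

    def lookup(self, module_name):
        if module_name not in self._cache:
            mod = importlib.import_module(module_name)
            self._cache[module_name] = dir(mod)
        return self._cache[module_name]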

@martinjohndyer

I'm on Ubuntu and get the same memory usage problem (see #831, a closely related if not duplicate issue).

@gramster
Member Author

gramster commented Mar 28, 2019

Hi all

There are three primary things of interest here:

  • peak memory usage
  • analysis time
  • analyses that stall or never complete

We should push out an update later today which will help address the first issue. It will increase analysis time on systems with ample RAM, but on a memory-constrained system it may make things better by reducing paging. We should be able to make further improvements to memory consumption over the next few days.

Regarding analysis time: yes, longer term we plan to implement caching and to keep making improvements (e.g. more extensive use of type stubs). The older language server didn't use caching either, so this isn't a regression, and with packages like numpy typically taking about 10 seconds to analyze, we found that analysis time is actually much improved. It may be that people are misattributing the time the old language server took; it is simply much more visible now that we show the count of items remaining.

Regarding the third point, those are clearly bugs that we need to track down and address, so reports of this and any repro steps will be very helpful.

On the other hand, this new language server does offer a number of improvements in the completions it can provide, and we wanted to get it out to gather feedback. It looks like we may have been too hasty, and I apologize for that (the decision was ultimately mine, so please direct your anger at me and not at other members of the team, who have been working very hard on this). For those who cannot wait while we try to make this right, we suggest switching to Jedi, at least in the short term, but I hope that we can make things much better within days.
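For anyone making that switch, a minimal user settings.json excerpt would look something like this (assuming the python.jediEnabled setting the extension used at the time):

{
    "python.jediEnabled": true
}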

Thanks.

@MikhailArkhipov

0.2.34 was published to stable and it should limit memory consumption (more improvements to come in the next few days).

You can either reload the window, and the extension should download the new LS, or, if it doesn't, use the "Open Extensions Folder" command, delete the languageServer* subfolder inside the Python extension folder, and then restart VS Code.

@MikhailArkhipov

@suiahaw - although it is technically possible to delay-load some modules, it only helps partially. Jedi is primarily a completion engine, while the LS also handles multiple other cases such as live syntax checking, linting, find references, rename, etc. Therefore it collects information that would otherwise be handled by rope, ctags, and so on.

Jedi is also known to consume tens of GB on large libraries (see microsoft/vscode-python#263 and microsoft/vscode-python#744); the extension kills the Python process if it exceeds a certain memory limit.

Now, the issue is not actually in the analysis. The issue is in the import dependency graph we build, and that is going to be fixed in the next few days.

@suiahae

suiahae commented Mar 28, 2019

@suiahaw - although it is technically possible to delay-load some modules, it only helps partially. Jedi is primarily a completion engine, while the LS also handles multiple other cases such as live syntax checking, linting, find references, rename, etc. Therefore it collects information that would otherwise be handled by rope, ctags, and so on.

Jedi is also known to consume tens of GB on large libraries (see Microsoft/vscode-python#263 and Microsoft/vscode-python#744); the extension kills the Python process if it exceeds a certain memory limit.

Now, the issue is not actually in the analysis. The issue is in the import dependency graph we build, and that is going to be fixed in the next few days.

Okay, got it. I hope my mistaken analysis doesn't cause you extra work.

Thanks, and I hope the problem can be resolved as soon as possible.

@jakebailey
Member

In VS Code, use the command palette to open your settings as JSON, and add it to the configuration.

@MikhailArkhipov

@MrTango it is not a public setting, so it will get squiggled.

@caparomula

I found that the python language server, running on RHEL7, seemed to have a thread leak: it kept spawning new threads until it reached my per-user process limit and started failing. After configuring VS Code to use the python analysis beta channel, it now spawns a reasonable number of threads and does not keep growing.

@jakebailey
Member

0.2.74 was promoted to stable; is that the version that had the leak before you switched? (Beta is now 0.2.76 which contains another change, though I'm not sure if it would have affected what you describe.)

@caparomula

@jakebailey I don't actually know which version I was using prior to setting the beta channel. But I had just updated VS Code several days ago, and switched to the beta channel only about half an hour ago.

@MikhailArkhipov removed this from the May 2019.1 milestone on May 14, 2019
@VelizarVESSELINOV

Using: "python.analysis.downloadChannel": "beta"

[screenshot]

[screenshot]

@gramster
Member Author

Hey Velizar, can you try to post a repo for us? From our telemetry, the 90th percentile memory use per session is now about 1.3GB; the 99th percentile is under 10GB, so 30GB is very unusual, and it would be helpful to understand what is going on in your environment to cause that. We have been working hard on bringing memory use down but there may still be bugs we haven't squashed that are occurring with some packages.

@memeplex

Well, I've just reported that importing pandas uses half a GB of memory, and it seems that, by the standards here, that's quite low :). Anyway, there is the startup-time issue too. Each time I restart Code it takes 30 seconds to 1 minute for the language server to become functional for those big namespaces again. Jedi does quite a bit better in this respect (lower initial latency and lower memory usage), despite being quite a bit worse on other fronts.

@tejasvi

tejasvi commented Jun 6, 2019

I receive a chrome://global/content/elements/notificationbox.js:[some number] unresponsive error before the server runs out of RAM and swap. Killing the server quickly after starting VS Code seems to avoid the issue.

@tejasvi

tejasvi commented Jun 6, 2019

The code I'm currently working with:
https://gist.github.com/tejasvi/11777404af5fa5ab57fd7918b4dcf031

@tejasvi

tejasvi commented Jun 6, 2019

That is not normal, that is a bug. We have been working for many weeks on performance improvements, including memory usage. At the latest revision, analyzing tensorflow on my machine only uses 400 MB, as opposed to the multiple gigabytes it used to take. If you have a project which reproduces it, we'd love to test it and figure out why that would occur.

Also, try switching up to the beta release, which has many more improvements not yet in stable.

"python.analysis.downloadChannel": "beta"

Where am I supposed to put the setting? It isn't recognized in ctrl+,

@jakebailey
Member

In your user settings.json, which you can find by opening your command palette and searching for "JSON".
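For example, the user settings.json would then contain something like the following (as noted earlier in the thread, this is not a public setting, so the editor may flag it with a squiggle):

{
    "python.analysis.downloadChannel": "beta"
}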

@MikhailArkhipov

MikhailArkhipov commented Jun 7, 2019

#1133 and #1134 should have improved this case. #1134 is a significant change, so the version was bumped to 0.3. We just published it to the daily channel and will let it bake for 2-3 days. If nothing goes wrong, we'll push it to beta and then to stable.

I am going to close this issue. Please open separate cases for specific sets of libraries or repro steps, like #1157.

Thanks!

@CaselIT

CaselIT commented Jul 9, 2019

I'm again having high memory usage problems, on 0.3.20.

There seems to be a leak somewhere: it finishes analyzing fine, using about 1GB, but then over time it grows to 10GB+.

Should I open a new issue?

@jakebailey
Member

Yes, please.

@CaselIT

CaselIT commented Jul 9, 2019

@jakebailey I've opened issue #1298.

@hoveidar

hoveidar commented Aug 2, 2019

I'm running my Python code in the terminal and editing it in VS Code. After aborting the program, and even after closing the corresponding VS Code window, `Microsoft.Python.LanguageServer` fluctuates between 500 MB and 1.2 GB of memory. I'm using version 0.3.39 on macOS, by the way.

@MikhailArkhipov

@hoveidar - this consumption is not unusual if you use large libraries such as tensorflow. Also, this number is not necessarily memory that is actively held. Managed runtimes like those of C# or Java release memory when there is memory pressure in the OS, not immediately as C++ does. Thus, a machine with 8GB of RAM and 2GB free may show a smaller consumption number than a machine with 64GB and 32GB free, simply because the runtime sees no reason to release its memory pool back to the OS.

This thread is closed so please report specific problems by opening new issues. Each case is different and may need separate investigation by different developers.

@bluebrown

bluebrown commented Aug 5, 2019

Not sure why you guys are defending this so hard. It's obviously broken!

Frequently it gets stuck scanning for dependencies. I have to restart my cheap laptop after at most 30 minutes because it nearly freezes.

I thought this was related to it being an old, cheap machine. However, I am now at work on a newer PC, and I just had to mark myself as having a technical issue and restart the machine as it was about to freeze completely. At least the session lasted longer than 30 minutes before it happened.

Still, it's undeniable that this is related to the Python language server.

[screenshot]

@jakebailey
Member

This isn't a defense or a statement that no issues exist; we're asking that this 4 month old closed issue about a different bug (but with potentially similar symptoms) not be used as the place to talk about a new issue. The analysis not completing may or may not have something to do with memory usage, but in any case is not going to have the same fix as what has been made here.

Note that #1298 is ongoing about a recent memory issue, with some progress made, but the discussion of that issue should happen there, not here. If you can reproduce your issue of the analysis never completing, please do open another issue so we can look into it. Thanks.

@HusseinBakri

HusseinBakri commented Mar 14, 2020

I love using VS Code with the Python extension, but lately I have been having the same issues described here. VS Code with the Python extension consumes a massive amount of memory, rendering the system unusable! It is consuming 5GB of RAM, and in some cases even more. I am working on a laptop with only 8GB of RAM. I am using Windows 10, the latest version of Python 3.8, and VS Code updated to the latest version (March 2020). I mean, this is ridiculous! Any solution yet?

@MikhailArkhipov

A couple of things. First, I'd suggest opening a separate issue rather than posting in a closed thread. Each issue is different and there is not much common ground - i.e. there is no single 'bug' that explains the memory footprint in all cases.

Second, it depends on which libraries you use. Some libraries are very large - literally thousands of modules - and multiple data science libraries together yield tens of thousands of files. As with other dynamically typed languages, figuring out what a particular function might return in Python, or what type some variable might have, becomes quite expensive. Without type annotations or stubs it literally means walking through every line of code in a library, looking for what a particular code branch might return or what members it might add to a class.
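A small illustration of that point (not the language server's actual code): without an annotation, a tool has to walk a function's body, and potentially the libraries it calls, to guess the return type, whereas an annotation or stub answers the question immediately.

def load(path):                      # no annotation: the return type can only be found by
    import pandas as pd              # analyzing pandas.read_csv itself
    return pd.read_csv(path)

def load_typed(path: str) -> "pandas.DataFrame":   # annotation (or a .pyi stub): the return
    import pandas as pd                            # type is known without reading any
    return pd.read_csv(path)                       # library code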

@microsoft locked as resolved and limited the conversation to collaborators on Mar 14, 2020