-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCS/ASYNC: Skip released handler in poll missed process #9942
base: master
Are you sure you want to change the base?
Conversation
@jeynmann, pls fix commit titles to trigger CI. https://github.com/openucx/ucx/wiki/Guidance-for-contributors |
db814db
to
a1408aa
Compare
pls check CI failures. Coverity error looks relevant:
|
@evgeny-leksikov |
src/ucs/async/async.c
Outdated
ucs_mpmc_queue_for_each(elem, &async->missed) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ucs_mpmc_queue_for_each(elem, &async->missed) | |
{ | |
ucs_mpmc_queue_for_each(elem, &async->missed) { |
src/ucs/async/async.c
Outdated
ucs_mpmc_queue_for_each(elem, &async->missed) | ||
{ | ||
if (ucs_async_missed_event_unpack_id(elem->value) == handler->id) { | ||
elem->value = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a constant for special value (-1)
test/gtest/ucs/test_async.cc
Outdated
void create_event() { | ||
ucs_status_t status = ucs_async_pipe_create(&m_event_pipe); | ||
ASSERT_UCS_OK(status); | ||
EXPECT_EQ(ucs_atomic_cswap32(&m_event_valid, 0, 1), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and below: expected value goes first in arg list, it makes matter in output of failed test
EXPECT_EQ(ucs_atomic_cswap32(&m_event_valid, 0, 1), 0); | |
EXPECT_EQ(0, ucs_atomic_cswap32(&m_event_valid, 0, 1)); |
test/gtest/ucs/test_async.cc
Outdated
le->set_event(); | ||
suspend(20 - wait); | ||
le->check_miss(); | ||
++round; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this value used anywhere?
src/ucs/async/async.c
Outdated
async = handler->async; | ||
if (async != NULL) { | ||
ucs_mpmc_queue_block(&async->missed); | ||
ucs_mpmc_queue_for_each(elem, &async->missed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add ucs_mpmc_queue_for_each
in .clang-format
(ForEachMacros
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe instead of exposing ucs_mpmc_queue_lock, ucs_mpmc_queue_unlock and foreach, can add ucs_mpmc_remove_if(mpmc, predicate_cb, arg) to take a callback that returns 0/1 whether the given element should be removed?
src/ucs/datastruct/mpmc.h
Outdated
/** | ||
* Lock MPMC queue. | ||
*/ | ||
void ucs_mpmc_queue_block(ucs_mpmc_queue_t *mpmc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename no lock/unlock?
test/gtest/ucs/test_async.cc
Outdated
@@ -811,6 +889,19 @@ UCS_TEST_SKIP_COND_P(test_async_event_mt, multithread, | |||
EXPECT_GE(min_count, exp_min_count); | |||
} | |||
|
|||
UCS_TEST_SKIP_COND_P(test_async_event_mt, multithread_create_destory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: multithread_create_destroy
test/gtest/ucs/test_async.cc
Outdated
@@ -110,6 +126,7 @@ class base_event : public base_async { | |||
} | |||
|
|||
ucs_async_pipe_t m_event_pipe; | |||
volatile uint32_t m_event_valid = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: maybe init it in ctor
test/gtest/ucs/test_async.cc
Outdated
EXPECT_EQ(0, ucs_atomic_cswap32(&m_event_valid, 0, 1)); | ||
} | ||
|
||
void destory_event() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: destroy_event
void reset() { | ||
ucs_async_pipe_drain(&m_event_pipe); | ||
} | ||
|
||
void create_event() { | ||
ucs_status_t status = ucs_async_pipe_create(&m_event_pipe); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: What happens when we call this function on already "created" event?
Shouldn't we check if object is already initialized, and if so - release the previous state first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, double initialization is an error.
The life time of event is managed by base_event
, since we call create/destroy method inside the constructor/destructor.
base_event(ucs_async_mode_t mode) : base_async(mode) {
create_event();
}
...
virtual ~base_event() {
destroy_event();
}
To avoid double initialization, I added assertion inside the create_event
method, see
EXPECT_EQ(0, ucs_atomic_cswap32(&m_event_valid, 0, 1));
In addition, if we explicitly call create_event
, which means we want to manage the event by ourself, then we're responsible for destroy_event
.
test/gtest/ucs/test_async.cc
Outdated
check_is_blocked(le, false); | ||
|
||
int result = le->count(); | ||
delete le; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls consider using std::scoped_ptr to get rid of explicit delete
You can also just return le->count();
in this case without saving in tmp var
src/ucs/async/async.c
Outdated
async = handler->async; | ||
if (async != NULL) { | ||
ucs_mpmc_queue_block(&async->missed); | ||
ucs_mpmc_queue_for_each(elem, &async->missed) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe instead of exposing ucs_mpmc_queue_lock, ucs_mpmc_queue_unlock and foreach, can add ucs_mpmc_remove_if(mpmc, predicate_cb, arg) to take a callback that returns 0/1 whether the given element should be removed?
@jeynmann pls force push the last PR to fix the commit title |
fa48d26
to
d90ff5c
Compare
/azp run |
Azure Pipelines successfully started running 4 pipeline(s). |
@jeynmann pls squash |
UCS/ASYNC: avoid call handler if already remove GTEST/UCS: rename test func name GTEST/UCS: use ucs rand UCS/ASYNC: fix test code + format GTEST/UCS: fix format UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to CI UCS/ASYNC: fix code according to CI UCS/ASYNC: fix code according to CI UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments UCS/ASYNC: fix code according to comments GTEST/UCS: fix code according to comments
a0e60bc
to
b9c293b
Compare
What
Skip events in poll missed process if its handler has already been removed.
Why ?
Async handler might be removed while events missed.
When the handler be called in poll missed process and same memory is reused by the other worker, the handler will try to access wrong resources.
How ?
Set MPMC queue element's value to -1 if handler removed, then skip the events in poll missed process.